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Abstract 

The promoter selectivity of Escherichia coli RNA polymerase is determined by the sigma subunit with promoter recognition 
activity. The model prokaryote Escherichia coli contains seven species of the sigma subunit, each recognizing a specific set 
of promoters. The major sigma subunit, sigma-70 encoded by rpoD, plays a major role in transcription of growth-related 
genes. Concomitant with the increase in detection of promoters functioning in vivo under various stressful conditions, the 
variation is expanding in the consensus sequence of RpoD promoters. In order to identify the canonical sequence of 
"constitutive promoters" that are recognized by the RNA polymerase holoenzyme containing RpoD sigma in the absence of 
supporting transcription factors, an in vitro mixed transcription assay was carried out using a whole set of variant promoters, 
each harboring one base replacement, within the model promoter with the conserved -35 and -10 sequences of RpoD 
promoters. The consensus sequences, TTGACA(-35) and TATAAT(-IO), were identified to be ideal for the maximum level of 
open complex formation and the highest rate of promoter opening, respectively. For identification of the full range of 
constitutive promoters on the £ coli genome, a total of 2,701 RpoD holoenzyme-binding sites were identified by Genomic 
SELEX screening, and using the reconfirmed consensus promoter sequence, a total of maximum 669 constitutive promoters 
were identified, implying that the majority of hitherto identified promoters represents the TF-dependent "inducible 
promoters". One unique feature of the constitutive promoters is the high level of promoter sequence conservation, about 
85% carrying five-out-of-six agreements with -35 or -1 0 consensus sequence. The list of constitutive promoters provides the 
community resource toward estimation of the inducible promoters that operate under various stressful conditions in nature. 
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Introduction 

The bacterial RNA polymerase core enzyme with the subunit 
structure oc fSR'co is fully active in catalysis of RNA polymerization 
but is unable to initiate transcription from promoters. Transcrip- 
tion initiation from gene promoters requires an additional 
dissociable sigma subunit, which reversibly associates with the 
core enzyme to form the holoenzyme, and directs the core enzyme 
to recognize promoters for transcription initiation. Most bacteria 
encode multiple species of the sigma factor [1,2]. In Escherichia coli, 
seven species of the sigma subunit exist, each recognizing a specific 
set of promoters [1,3]. The intracellular levels of seven sigma 
subunits vary depending on cell growth conditions [3,4]. Sigma 
replacement is a simple mechanism of switching of the pattern of 
genome transcription [4,5], and the intracellular concentration of 
seven sigma factors is therefore a primary determinant of the 
pattern of genome transcription. 

The sigma-70, encoded by the rpoD gene, is the primary and 
major sigma, which is responsible for transcription of most of the 
genes that are expressed in exponentially growing phase of E. coli 
[1,2,6,7]. The holoenzyme containing RpoD sigma recognizes in 



vitro a pair of hexanucleotide sequence elements, TTGACA (-35) 
and TATAAT (- 1 0) which are situated at 1 0 and 35 bp upstream, 
respectively, of transcription initiation sites [2, 6, 8. 9], and a 
spacer DNA of approximately 17 bp in length separates these two 
hexanucleotide sequences. This consensus sequence of RpoD 
promoters was originally proposed based on in vitro transcription 
assays of some model templates by purified RNA polymerase 
[10,11]. Later the promoter sequences were determined for a 
variety of functioning promoters in vivo, one by one, by using 
ordinary molecular genetic approaches [12,13]. More recendy 
high-throughput experimental systems such as ChlP-chip analysis 
of RNA polymerase-bound DNA sequences [14,15] andRNA-Seq 
analysis of whole sets of transcripts [16-18] have been employed 
for searching functioning promoters in vivo. In parallel, computa- 
tional approaches have been employed to identify promoters 
relying on the consensus sequences predicted based on the known 
promoters. Up to the present time, more than 2,000 experimen- 
tally defined or computational predicted promoters have been 
identified, which are under the direct control of RpoD sigma (see 
the promoter databases such as RegulonDB [19] and EcoCyc 
[20]). As a result, the striking diversity appeared in the canonical 
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promoter sequence, mainly arising from the detection of weak 
promoters, which are activated only in the presence of positive 
transcription factors (TFs). 

The constitutive promoters have been defined as those that are 
active in vivo in all circumstances, but it is practically impossible to 
identify the whole set of constitutive promoters under various 
environmental conditions. Instead we propose in this study to 
define the "constitutive promoter" as the promoters that are 
recognized in vitro by RNA polymerase RpoD holoenzyme alone in 
the absence of additional supporting proteins with regulatory 
functions. For the identification of the whole set of constitutive 
promoters on the E. coli genome, we performed the Genomic 
SELEX screening system of genome DNA sequences recognized 
by the RNA polymerase holoenzyme containing RpoD sigma. 
The Genomic SELEX system was developed to identify DNA 
sequences recognized in vitro by DNA-binding transcription factors 
[21] and successfully applied for the identification of regulation 
targets of many TFs [22]. After SELEX-chip screening, a total of 
2,071 sites were identified on the E. coli genome, which are 
recognized by the RpoD holoenzyme alone. The location of 
constitutive promoters within these RpoD holoenzyme-binding 
regions was then computationally identified using the consensus 
sequence, which was experimentally determined in this study using 
the in vitro mixed transcription assay [23-25]. The total number of 
constitutive promoters on the entire E. coli genome was thus 
estimated to be between minimum 492 and maximum 669. This 
number of constitutive promoters represents only about one-forth 
of the hitherto identified promoters on the E. coli genome, 
indicating that the rest of promoters listed in the promoter 
databases represents the "inducible promoters", which are 
activated in the presence of supporting TFs. One unique feature 
of the constitutive promoters is the high-level conservation of 
consensus sequences, TTGACA(-35) and TATAAT(-IO), each 
being separated by a 1 7-bp spacer. 

In the absence of regulatory proteins with repression activity, 
the constitutive promoters must be always expressed. Based on the 
Genomic SELEX screening of the binding sites of nucleoid protein 
H-NS along the E. coli genome, we also propose that the H-NS 
plays a major role in silencing of the unnecessary expression of 
constitutive promoters. 

Results 

Genomic SELEX screening of RpoD holoenzyme-binding 
sequences on the £ coli genome 

The constitutive promoters are transcribed in vitro by the RNA 
polymerase RpoD holoenzyme alone in the absence of supporting 
TFs. In order to identify the whole set of constitutive promoters on 
the entire genome of E. coli K-12 M3110, we first performed a 
mass-screening in vitro of the whole set of sequences that are 
recognized by the reconstituted RpoD holoenzyme. For this 
purpose, we prepared sigma-free core enzyme by passing the 
purified RNA polymerase three times through phosphocellulose 
column chromatography in the presence of 5% glycerol (note that 
sigma-core interaction becomes stronger in the presence of 
increasing glycerol concentration as used for prolonged storage 
of the holoenzyme [26]). The level of remaining sigma subunits 
was less than 0. 1 %, if any, as detected by both protein staining and 
immuno-staining with antibodies against each of all seven species 
of E. coli sigma subunits (RpoD, RpoN, RpoS, RpoH, RpoF, 
RpoE and Feci) (data not shown). The stoichiometry between core 
enzyme subunits was also checked by immuno-staining with 
antibodies against the core subunits, RpoA, RpoB, RpoB and 
RpoZ. The RpoD holoenzyme fully saturated with RpoD sigma 



was reconstituted by mixing this sigma-free core enzyme and 4- 
fold molar excess of purified RpoD sigma, which alone does not 
bind to DNA. 

For the identification of DNA sequences that are recognized by 
RpoD holoenzyme, we employed the Genomic SELEX screening 
system [2 1] , in which a library of E. coli genome DNA fragments of 
200-300 bp in length was used instead of synthetic oligonucleo- 
tides with all possible sequences used in the original SELEX 
method [27-29]. The multi-copy plasmid library of 200-300 bp- 
long random DNA fragments was constructed from the E. coli K- 
12 W31 10 genome [21]. The library used in this study contained 
7-fold molar excess of the entire genome, and thus a single and the 
same sequence might be included in 7 different overlapping 
segments on average, thereby increasing the resolution of mapping 
of SELEX fragments. In each experiment of Genomic SELEX 
screening, the mixture of genome DNA fragments, which was 
regenerated by PCR from the genome DNA library, was mixed 
with 2-fold molar excess of the reconstituted RpoD RNA 
polymerase holoenzyme, and subjected to Genomic SELEX 
screening. DNA-holoenzyme complexes formed were recovered 
using the anti-RpoC antibody, which gave the highest level of 
RNA polymerase recovery among all the anti-core subunits. RNA 
polymerase-associated DNA was isolated from the antibody 
precipitates, amplified by PCR, and subjected to next cycles of 
SELEX. After two-cycles of SELEX screening, the final products 
of RpoD holoenzyme-bound DNA fragments were subjected to 
mapping on the genome using a DNA tilling microarray (Oxford 
Gene Technology, Oxford, UK) [30-32]. On the DNA tilling 
array used, the 60 b-long DNA probes are aligned at 105 bp- 
intervals in the order of E. coli genome sequence, and therefore 
approximately 300 bp-long SELEX fragments should bind to two 
or more consecutive probes. This criterion was employed to avoid 
the background noise of non-specific binding of RpoD holoen- 
zyme-bound DNA fragments to the tilling array. 

The sequences with binding affinity to the RpoD holoenzme 
formed a number of peaks along the entire E. coli genome. By 
setting the cut-off level of 2.0% relative to the highest peak located 
within a spacer upstream of ssrA (SsrA smRNA) and downstream 
of smpB (SmpB trans-translation factor), a total of 2,701 RpoD 
homolenzyme-binding peaks were identified, of which 1,075 (40%) 
are located within 543 intergenic spacers (average 1.98 sites for 
each spacer) (Fig. 1A). On the other hand, a total of 1,626 (60%) 
peaks are located inside of 777 open reading frames (average 2.09 
peaks per gene) (Fig. 1A). Since the majority of hitherto identified 
promoters are located within spacers, detailed search for the 
constitutive promoters was focused on the total of 1,075 peaks 
within 543 spacers. 

Location of the constitutive promoters within the £ coli 
genome 

The spacers containing RpoD holoenzyme-binding sites can be 
classified into three types (Fig. 1A): 404 peaks are located within 
177 type-A spacer between bidirectional transcription units 
(Table 1 for the whole list; see also Table SI); 583 peaks are 
located within 315 type-B spacers located upstream of one 
transcription unit but downstream of another transcription unit 
(Tables 2 and 3 for the whole list; see also Table S2); and 87 peaks 
are located within type-C spacers downstream of both transcrip- 
tion units of type-C spacers. Based on the transcription direction of 
flanking genes, the total number of constitutive promoters was 
predicted to range between minimum 492 (177 A-type plus 315 B- 
type) and maximum 669 (177x2 A-type plus 315 B-type). 

Type-A spacers should contain at least two promoters for 
bidirectional transcription. The RpoD holoenzyme-binding sites 
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| Distribution of RpoD holoenzyme-binding sites 
(Within Spacers) (On Genes) 

1,075 sites (40%) | 1,626 sites (60%) | (Total, 2,701) 
(543 spacers) 



Type-A 403 
Type-B 585 



(177 spacers) 




(154 spacers) 
(161 spacers) 

(49 spacers) 



Classification of Escherichia coli promoters 
Constitutive promoters + Inducible promoters 



| Total, 2,082 (Promoter database) 




Constitutive promoters (SELEX) 


I 

1 --. 

i 


Type-A 


Type-B | (Total, 492-669) 



177-354 



315 



Figure 1. Distribution and classification of the constitutive 
promoters. Genomic SELEX search of RpoD holoenzyme-binding 
sequences was performed using the standard procedure [21]. RpoD 
holoenzyme-bound DNA fragments were recovered by immunoprecip- 
itation using anti-RpoC antibody. SELEX fragments were isolated from 
the immuno-precipitates and subjected to mapping on the E. coli 
genome by using tilling DNA microarray as described previously [32,33]. 

[A] Location of the constitutive promoters. A total of 2,701 RpoD 
holoenzyme-binding sites were identified (see Fig. 1), of which 1,075 
(40%) are located within intergenic spacers. On the basis of 
transcription direction of flanking genes, the spacers were classified 
into three types: type-A between bidirectional transcription units; type- 
B upstream of one transcription unit but downstream of another 
transcription unit; and type-C, downstream of both transcription units. 

[B] Classification of the constitutive promoters. A total of 2,082 
promoters have been identified and listed in the current versions of 
RegulonDB and EcoCyc databases, whereas the total number of 
constitutive promoters identified by Genomix SELEX screening ranges 
between minimum 492 and maximum 669, indicating that the majority 
of £ coli promoters listed in promoter database are TF-dependent 
inducible promoters. 

doi:1 0.1 371 /journal.pone.0090447.g001 

identified in a total of 177 type-A spacers should represent 
promoters for one or both of bidirectional transcription. Close 
observation of the SELEX-chip pattern indicates that two RpoD 
holoenzyme-binding sites can be identified if the spacer is longer 
than 500 bp in length (Fig. 2). For instance, two peaks of RpoD 
holoenzyme binding were identified within a single and the same 
Type-A spacer between 755 bp-long csgD-csgB and between 
1458 bp-long nanC-fimB (Fig. 2). Generally this-group promoters 
associated with stress-response genes are located within long 
spacers including the binding sites of a number of TFs such as in 
the spacers of 755 bp-long csgD-csgB, 920 bp-long IrhA-alaA and 
1456 bp-long nanC-fimB. One typical example is the promoter for 
the csgD gene encoding the master regulator of biofilm formation, 
which is under the control of more than 20 TFs [23,33]. 

The binding of RpoD holoenzyme was identified in a total of 
315 type-B spacers (Fig. 1A). The binding of RpoD holoenzyme 
alone to type-B spacers represents the presence of at least one 
constitutive promoter for one direction transcription. Even for this 
group of constitutive promoters, more than two RpoD holoen- 



zyme-binding peaks were identified for some spacers (Fig. 3), 
indicating the presence of multiple promoters for one and the 
same transcription units such as the cydA promoters within 847 bp 
spacer and the yobF promoters within 670 bp spacer. In good 
agreement with the presence of multiple peaks for the mngB-cydA 
type-B spacer, five promoters have been identified for the cydAB 
operon encoding cytochrome bd-l terminal oxidase [34—36], of 
which at least two may be the constitutive promoters that function 
in the absence of activator TF. The collection of constitutive 
promoters within type-B spacers also includes a total of 40 internal 
promoters located within intergenic spacers of single operons 
(indicted by symbol "Int" in P column of Tables 2 and 3; and 
Table S2). These internal promoters might play physiological roles 
under as yet unidentified circumstances. In fact a constitutive 
internal promoter within the rplKAJL-rpoBC operon has been 
identified [37], which should contribute the expression level 
control between four ribosomal proteins (LI 1, LI, L10 and LI 2) 
and two RNA polymerase subunits (RpoB and RpoC). 

The total number of RpoD promoters (or the transcription 
initiation sites) listed in the current databases (RegulonDB and 
EcoCyc) are as many as 2,082, indicating that the majority of 
known promoters represent TF-dependent "inducible promoters" 
that are expressed only under the support of positive regulatory 
proteins (Fig. IB). Among the total of 2,082 RpoD promoters (or 
transcription initiation sites) listed in the current version of 
promoter databases, the promoter sequences are described for a 
total of 582, of which 434 (75%) are located within the same 
spacers that contain the constitutive promoters (255 in type-A 
spacers and 1 79 in type-B spacers) (shown under black background 
of P columns in Tables 1, 2 and 3). 

Identification of the consensus sequence of constitutive 
promoters recognized by the RpoD holoenzyme 

For identification of the constitutive promoters within type-A 
and type-B spacers with the binding sites of RpoD holoenzyme, we 
performed in silico search using the consensus sequence of 
constitutive promoters. The current databases of E. coli promoters 
include both experimentally identified and computationally 
predicted promoters. The prediction of promoters has been 
performed using the canonical promoter, TTGACA-17 bp- 
TATAAT, which was originally identified using the in vitro 
transcription studies [10,11,38]. We then tried to identify the 
consensus sequence of constitutive promoters recognized in vitro by 
RpoD holoenzyme alone. 

In order to experimentally confirm the consensus sequence 
recognized by RNA polymerase RpoD holoenzyme, we first 
constructed an ideal promoter with the complete set of consensus 
TTGACA (-35) and TATAAT (-10) sequences, separated by a 
spacer of 17 bp in length, starting from the lacUV5 promoter [23]. 
To identify the best RpoD promoter giving the highest activity and 
to examine the role of individual bases within two hexanucleotide 
sequences, we then constructed a total of 48 variant consensus 
promoters, each carrying one base replacement at each position of 
both -35 and -10 signals. For accurate measurement of the RNA 
product directed by each variant promoter, we employed the in 
vitro mixed transcription system [24,25], in which transcription of 
each variant promoter was carried out in the simultaneous 
presence of the ideal promoter with the complete consensus 
sequence added as an internal reference. The test promoter 
directed the synthesis of 42 b-long run-off transcript while the ideal 
promoter directed the synthesis of 22 b-long run-off transcript 
[23]. The same amounts of two promoter fragments were mixed 
and incubated with 10-fold molar excess of RpoD holoenzyme for 
various time periods to allow the formation of open complexes, 
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Table 1. Constitutive Promoters (Type-A Spacers). 



Map 


Promoter sequence 


Right Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Left Operon 


Promoter sequence 


0.87 


T T AAC G - 1 6 - A AT AAT 


caiTABCDE 


caiT 


< 


HNS 


> 


fixA 


fixABCX 


GTGACA-1 7-TAAAGT 


1.82 


TTGATT-18-TGAAAT 


leuLABCD 


leuL 


< 




> 


leuO* 


leuO 


TTCGCA- 1 7 -TATTAT 


2.43 


TTGTCT-18-CATAAA 


coaED 


coaE 


< 




> 


guaC 


guaC 


TTTATA- 1 8 -GATTAT 


2.99 


ATGACG- 1 8 - 771 TAA T 


gcd 


gcd 


< 


HNS 


> 


hpt 


hpt 


TTAATA-18-TATAGG 


3.06 


TTTAAA- 1 7 -TAT ATT 


can 


can 


< 




> 


yadG 


yadGH 




5.16 


ATAACA- 17- GAT AT T 


yafV 


yafV 


< 




> 


ivy 


ivy 


TTGGAA- 1 7 -TATCGT 


5.19 


TTGCTA- 1 7 -TACAAC 


fadE 


fadE 


< 




> 


IpcA 


IpcA 


CTGACC-17-TGTAGT 


5.31 


TTGTCG-17 -TACAAT 


dinJ-yafQ 


dinJ 


< 


HNS 


> 


yafL 


yafL 


TTTACA-1 7-TATGTT 


5.57 


TTAAGA- 1 7 - TATATC 


phoE 


phoE 


< 


HNS 


> 


proB 


proBA 


TTGTTA- 1 7 -AATAAT 


5.99 


CTGCCA-1 6-TATGAT 


insAB-2 


insA2 


< 




> 


insl-1 


insl-1 




6.04 


TCTACA-17-TTTATT 


yagA—ykgN 


yagA* 


< 




> 


yagE 


yagEF 


TTTACC-1 6 -CATAGT 


6.22 


TTGTGA-16 - A AT AAT 


argF 


argF 


< 




> 


ykgS 


ykgS 


T TGAAT - 1 8 -TGTAAT 


6.25 


CTGCCA-1 6-TATGAT 


insAB-3 


insA3 


< 




> 


yagj 


yagj 




6.36 


TTTATA-16 - T AT AGC 


intF 


intF 


< 




> 


ptwF 


ptwF 


TTGCTA- 1 8 -TATAAA 


7.07 


TTGAAC- 1 7 -TTTAAT 


betlBA 


betl* 


< 


HNS 


> 


betT 


betT 


TGGACG-1 7 -CTTAAT 


7.43 




yahN 


yahN 


< 




> 


yahO 


yahO 




8.26 


7T67IC7I-18-TACAAT 


yaiS 


yaiS 


< 


HNS 


> 


tauA 


tauABCD 


TTG AG A- 1 6-TACAAT 


11.08 




qmcA-ybbJ 


qmcA 


< 




> 


ybbL 


ybbLM 




12.53 


TTGATT- 1 7 -TAT A AC 


ybcY 


ybcY 


< 


HNS 


> 


tfaX 


tfaX 


ATGGCA-15-TTAAAT 


13.05 


TTGCCT-17-TACCAT 


ybdK 


ybdK 


< 




> 


sokE 


sokE 


ATGGCA- 1 6 -AAGAAT 


13.14 


TTGATT - 1 8 -TATTAT 


fepA 


fepA 


< 


HNS 


> 


fes 


fes- -entFE 


AAGACA- 1 7 - AATAAT 


13.37 


TCGATA- 1 7 -TATCAT 


fepDGC 


fepD 


< 




> 


entS 


entS 


ATGAAA- 1 7 -TCTTAT 


13.42 


ATGATA- 1 7 -TATCAT 


fepB 


fepB 


< 




> 


entC 


entCEBAH 


ATGATA- 1 7 -TATCAT 


13.73 


ATGACA-1 6-TTTACT 


dsbG 


dsbG 


< 




> 


ahpC 


ahpCF 


TTGCCC-1 7 -TGTAAT 


14.09 


ATGACA- 1 7 -AATTAT 


dcuC 


dcuC 


< 


HNS 


> 


pagP 


pagP 


TTAAGA- 17- T AAAAA 


14.54 


CTGACG- 1 7 -AATAAG 


ybeQ 


ybeQ 


< 


HNS 


> 


ybeR 


ybeR-djiB 




15.13 


TTTACT-17-TATTTT 


nagB 


nagB 


< 




> 


nagE 


nagE 


GTGACA-1 9-TTT AAT 


15.69 


ATGAAA- 1 6 -TGTAAT 


kdpFABC 


kdpF 


< 


HNS 


> 


ybfA 


ybfA 


TTCGCA- 1 7 -TGTAAA 


16.22 


TTGA CA- 1 8-TAC AAA 


git A 


gltA 


< 


HNS 


> 


sdhC 


sdhCDAB 


TTGTAA- 1 7 -TATACT 


16.46 


TTGAAA-1 8-TATTAT 


mngR 


mngR* 


< 


HNS 


> 


mngA 


mngAB 


T T AATA- 1 9 - GAT AAA 


17.93 


CTGAAA-18-TATTGT 


ybiA 


ybiA 


< 




> 


dinG 


dinG 


ATGCCA- 1 6-TACAGT 


18.29 


T T C AAA- 1 8 - T AT ATC 


rhtA 


rhtA 


< 


HNS 


> 


ompX 


ompX 


TTGACT-1 8-TGGAAT 


18.37 


TTAACG-1 6 - TATAAT 


mntS 


mntS 


< 




> 


mntR* 


mntR-ybiR 


ATTACA- 1 8 -TATATT 


19.13 


TTGATA-1 9-TAAAAT 


ybJL 


ybjL 


< 


HNS 


> 


ybJM 


ybJM 


TTGAAA-1 5-GATAAT 


21.47 


ATGACA-1 7-AATAAT 


ssuEADCB 


ssuE 


< 




> 


elfA 


elfAD 


T T TAAA- 1 7 -TATTAT 


21.98 


TTAACT-1 6-TATTCT 


su/A 


su/A 


< 


HNS 


> 


sxy 


sxy 


TTGCCC-1 7 -TATTTT 


22.59 


TGGAGA- 1 8 -TACACT 


gfcA 


gfcA 


< 




> 


insA4 


insAB-4 


CTGCCA-1 6-TATGAT 


23.75 


TTATCA-1 7-TAAAAT 


csgDEFG 


csgD* 


< 


HNS 


> 


csgB 


csgBAC 


CTGACA-1 7-TGTAGT 


23.86 


TGGATA- 1 7 - CAGAAT 


mdoC 


mdoC 


< 




> 


mdoG 


mboGH 


GT GAAA- 1 7 - C T T AAT 


24 03 


1 1 HHV.H 1 o 1 rtv_rt 1 1 


WaalVi 


Waal VI 












1 IbLLb 10 MM 1 MM 1 


25.16 


T T T AT A- 1 7 - T AAAAA 


comR 


comR* 


< 


HNS 


> 


bhsA 


bhsA 


TTCACC-17 -AATAAT 


25.79 


TTGTAA-18 - TATAAT 


ymfED 


ymfE 


< 




> 


lit 


lit 


TTGATA-1 5-AAT AAT 


26.15 


TTTGCA- 1 7 - TAAAAC 


bluF 


bluF* 


< 


HNS 


> 


ycgZ 


ycgZ-ymgC 


TTGTCA-1 8-TTTTAT 


26.43 


TTTACC-17-TGGAAT 


pliG 


pliG 


< 




> 


ycgL 


ycgLM 


TTCAGA- 1 8 - 771 TAA T 


26.48 


ATGACA-1 9-TATCAT 


hlyE 


hlyE 


< 




> 


umuD 


umuDC 


CTGGCA-16 -TATAAA 


26.56 


C T G AC C - 1 9 - C AT AAA 


nhaB 


nhaB 


< 




> 


fadR* 


fadR 


TTGATA-1 9-GAAAAT 
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Table 1. Cont. 



Map 


Promoter sequence 


Right Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Left Operon 


Promoter sequence 


26.62 


TAGATA- 1 7 - TCTAAT 


ycgB 


ycgB 


< 




> 


dadA 


dadAX 


TTATCA- 1 7 -TATTCT 


26.79 




ycgR 


ycgR 


< 




> 


ymgE 


ymgE 


TTGCCT-1 6-TGTAAG 


26.92 


T T GAAC - 1 6 - C GT AAT 


dhaKLM 


dhaK 


< 




> 


dhaR* 


dhaR 


TTGCGA- 1 7 -CATAAG 


27.10 


TTGTCA-1 7-TAAACT 


pth-ychF 


pth 


< 




> 


ychH 


ychH 


TTGTAA- 1 7 -CATAAC 


27.37 


TCGACA- 1 6-TAAAAT 


chaA 


chaA 


< 




> 


chaB 


chaBC 


TTGACC-1 8-TGTAAA 


27.84 


AT G AAA- 1 7 -TATCAT 


hns 


hns* 


< 




> 


tdk 


tdk 


TTACCA- 1 7 -TATAAC 


27.90 


GTGACG-16 -TCTAAT 


adhE 


adhE 


< 




> 


ychE 


ychE 


T T TAAA- 1 6 -TCTAAT 


29.94 


ATGATA- 1 9 -TCTAAT 


ycjY-ymjDC 


ycjY 


< 




> 


pgrR* 


pgrR 


TAGACC - 1 8 -TATCAT 


31.03 


T T GAAG -17-CATAGT 


IdhA 


IdhA 


< 




> 


ydbH 


ydbH-ydbL 


TTGAAA-1 5-AAAAAT 


31.90 




azoR 


azoR 


< 




> 


hrpA 


hrpA 


TAGAAA- 1 7 -TATATC 


32.07 


TTAACA- 1 9-TTTCAT 


gapC 


gapC 


< 


HNS 


> 


cybB 


cybB 


TTG AG A- 1 9-GAAAAT 


32.16 


TTAATA- 1 7 -TATCAT 


ydcl 


ydcl* 


< 


HNS 


> 


ydcl 


ydcl 


CTGACA-1 6-TATGAT 


32.48 


TAAACA- 1 8 -TGTAAT 


yncJ 


yncJ 


< 




> 


hicA 


hicAB 


TAAACA- 1 9 -TATACT 


33.50 


TTGCCA-1 5-TATAAA 


bdm-sra 


bdm 


< 


HNS 


> 


osmC 


osmC 


TTGATA-1 7-TATATT 


34.73 


TTTAAA-17-TATCTT 


sad 


sad 


< 




> 


yneJ* 


yneJ 


T T TACT - 1 7 - AAT AAT 


34.84 


CTGGCA- 1 7 -GAT AAT 


marC 


marC 


< 




> 


marR* 


marRAB 


TTGACT-1 7-TATTAT 


35.08 


ATGCCA- 1 7 -TTTAAG 


ydfl 


ydfl 


< 




> 


ydfK 


ydfK 


TTCCCA-1 6-GATAAT 


35.43 




relBE-hokD 


relB 


< 




> 


ydfV 


ydfV 




37.33 


TTGCCG- 1 7 -AATAAT 


grxD 


grxD 


< 


HNS 


> 


ydhO 


ydhO 


T T AAC T - 1 8 -TAG AAT 


37.44 


GTGAAA- 1 7 -TAATAT 


ydhB 


ydhB* 


< 




> 


ydhC 


ydhC 


TTCACA-1 7-TACACT 


37.78 


TTGTCT-18-TTTAAT 


ydhZ 


ydhZ 


< 


HNS 


> 


pykF 


pykF 


T T AACT - 1 8 -TATATT 


38.77 


TCGTCA- 1 7 -TAAAAT 


thrS- -infA 


thrS 


< 




> 


arpB 


arpB 


ATGATG- 16 -TATAAC 


41.11 


TTGCGA- 17- TA TAA T 


kdgR 


kdgR* 


< 




> 


yebQ 


yebQ 


TTATCA- 1 7 -TATAAA 


41.39 


TTAACA- 15-TGTG AT 


pphA 


pphA 


< 




> 


ryeA 


ryeA 


ATCACA-18-TAAAAA 


41.57 


TTGATT - 1 7 -TAT ACT 


yebC 


yebC 


< 


HNS 


> 


purT 


purT 


AAGACA- 1 8 -TATACT 


41.81 


ATGAAA- 1 7 -TCTCAT 


znuA 


znuA 


< 




> 


znuC 


znuCB 


ATGAGA-17-TTTCAT 


41.90 


TTGTCA-1 5-TACAAA 


yobl 


yobl 


< 




> 


yebB 


yebB 


CTGAAA- 1 7 -TATTAC 


41.96 


TTTCCA-17-TTTCAT 


aspS 


aspS 


< 




> 


yecD 


yecDEN 


GACACA- 1 9 -TATTAT 


42.61 


TTGAAA-1 7-TATGAC 


insAB-S 


insAS 


< 




> 


uspC 


uspC 


TTGGCA-1 6-TATAAG 


43.11 


CTAACA-1 9-AAAAAT 


fliC 


fliC 


< 




> 


fliD 


fliCST 




43.23 


TTGAAA-1 8-TTTAAT 


yedD 


yedD 


< 




> 


yedE 


yedEF 


TTCAGA- 1 7 -TATTAT 


43.32 


CTGACA- 1 6-ACTAAT 


yedM 


yedM 


< 




> 


intG 


intG 


TTGCCC-16-CACAAT 


43.89 


TTGTAA-16-TATAAC 


yedWV 


yedW* 


< 




> 


hluH 


hluH 


TAGAAA- 1 5 -CAAAAT 


44.99 


CTGACA- 1 6 -ACTAAT 


yefM-yoeB 


yefM* 


< 




> 


hisL 


hisLG-AFI 


TAAACA- 1 9 -TATAAA 


46.28 


TTGTTA- 1 8 -TATAAC 


yegl 


yegl 


< 


HNS 


> 


yegj 


yegj 


ATGATA- 1 7 -TACAAT 


46.68 


TAAACA- 1 8 -TAT ATT 


yegRZ 


yegR 


< 


HNS 


> 


yegS 


yegS 


T T TAAA- 1 7 -TATTAT 


48.01 


T C AAC A- 1 8 - C AT C AT 


dusC 


dusC 


< 




> 


yohJ 


yohJK 




48.30 


ATGAAA- 1 5 -AATAAT 


folE-yeiB 


folE 


< 




> 


yeiG 


yeiG 


CTCACA- 1 7 -TATGAT 


48.42 


TTAAAA-1 9 - TAT AAT 


yeiE 


yeiE* 


< 




> 


yeiH 


yeiH 


TTACCA-15 -TATAAA 


49.59 




napF- -ccmH 


napF 


< 




> 


yojO 


yojO 


TTG AG A- 1 6-CATAAT 


49.78 


ATGAAA- 1 7 -TATCAT 


ompC 


ompC 


< 


HNS 


> 


micF 


micF 


TTAAGA- 1 7 - AATAAG 


50.41 


TTGATC-16 - CAT A AT 


ygaL 


yfaL 


< 


HNS 


> 


nrdA 


nrdAB 




51.81 


CTGCCA-1 6-TATGAT 


IrhA 


IrhA* 


< 




> 


alaA 


alaA 


T TAAGA- 1 7 -TATAAC 


51.96 


ATGACA- 1 7 -CAT CAT 


yfbV 


yfbV 


< 




> 


ackA 


ackA-pta 


CTGAAA-1 6-TAGACT 


52.11 


GTGACA-1 5-TATAGT 


yfcF 


yfcF 


< 




> 


yfcG 


yfcG 


TCGGCA- 1 5 - TATCAA 


53.46 


TTGATA-17-TTTCAT 


emrKV 


emrK 


< 


HNS 


> 


evgA* 


evgAS 


TTGACAA 9-TATATG 
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Map 


Promoter sequence 


Right Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Left Operon 


Promoter sequence 


53.73 


TTGCCA- 1 7 -TACTAT 


yfdV 


yfdY 


< 




> 


IpxP 


IpxP 


TCGACA-1 8-AACAAT 


54.02 


TTTACA-17-TATTTT 


glk 


glk 


< 




> 


yfeO 


yfeO 


C TCAC A- 1 7 - CAT AAC 


54.09 


TTGATA- 1 7 -TAGCAT 


mntH 


mntH 


< 




> 


nupC 


nupC 


ATGACT- 1 7 -TTTAAT 


54.40 




yfeR 


yfeR* 


< 




> 


yfeH 


yfeH 


TTCTCA-17-TAGAAC 


54.49 


TTAATA-1 6-GGTAAT 


zipA 


zipA 


< 




> 


cysZ 


cysZK 


TTAACT-19-TGAAAT 


54.96 


TTGGCT-17 -CAT A AT 


ypeA-yfeZ 


ypeA 


< 




> 


amiA 


amiA-hemF 


TTGTCA-1 7-TAAAAC 


55.80 


TTCACC- 1 7 -TAGAAT 


ypfM 


ypfM 


< 


HNS 


> 


yffB 


yffB- -ypfN 


TTCATA- 1 6 -TTTAAT 


55.97 




dapA-bamC 


dapA 


< 




> 


gcvR* 


gcvR 


T T AAAA- 1 8 - T C T G AT 


56.43 


TTGCCG-16 - TATAAT 


upp-uraA 


upp 


< 


HNS 


> 


purM 


purMN 


ATGATA-17-TATTTT 


56.57 


ATGACA-1 8-TATAGT 


yfgF 


yfgF 


< 


HNS 


> 


yfgG 


yfgG 


TTGACC-1 7-CTTAAT 


56.70 




guaBA 


guaB 


< 




> 


xseA 


xseA 


TCGACT-1 6-TAGAAT 


57.34 


TCGGCA- 1 8 -TAAGAT 


trmJ 


trmJ 


< 




> 


suhB 


suhB 


AAGACA- 1 7 -TATACT 


57.81 


TTGATG-16 -TAT AAA 


glyA 


glyA 


< 




> 


hmp 


hmp 


TTTACA-1 6-TATAAG 


58.35 


TAAACA- 1 7 -CATCAT 


rpoE—rseC 


rpoE* 


< 




> 


nadB 


nadB 


TAGACT - 1 6 -TATAAG 


59.01 


TTGAAA-1 6-TATCGT 


aroF-tyrA 


aroF 


< 




> 


yfiL 


yfiL 


TTTTCA- 1 7 -TTTTAT 


59.32 




ratAB 


ratA 


< 




> 


smpB 


smpB 




60.00 


TAGATA- 1 6 -TATCAT 


ileY 


ileY 


< 


HNS 


> 


ygaQ 


ygaQ_ 12 


TTATCA- 1 7 -TGTAAT 


60.24 


TTAAGA- 1 7 -AACAAT 


yqaE 


yqaE 


< 


HNS 


> 


ygaV* 


ygaVF 


TTTAGA-17-AATACT 


60.27 


TTTACA-1 7-TTTAAT 


stpA 


stpA* 


< 


HNS 


> 


alaE 


alaE 


GTGATA- 1 7 -TCTAAT 


60.83 


TTGAAA-1 6-TTTGAT 


mltB 


mltB 


< 


HNS 


> 


srIA 


srIA- -gutM 


TTAACA-1 8-TATGGT 


61.61 


TTCACA-18-TATTTT 


ygbl 


ygbl* 


< 


HNS 


> 


ygbJ 


ygbJK 


TTCACA-1 6-GTTAAT 


62.45 


7TG/101-16-TGTGAT 


ygcW 


ygcW 


< 


HNS 


> 


yqcE 


yqcE-ygcE 


TTCTCA-1 8 -GATAAT 


62.56 


TTTAAA-17-GTTAAT 


queE 


queE 


< 


HNS 


> 


yqcG 


ygcG 


TTAACA-1 8 -GATAAT 


63.94 


TTGACG-1 8-TTTAGT 


rppH-ptsP 


rppH 


< 




> 


mutH 


mutH 


TCGGCA-1 8 -TTTAAT 


65.80 




ygfB- -visC 


ygfB 


< 




> 


zapA 


zapA 


TTGTCT-1 7 -TATAGT 


68.75 


TAGAGA- 1 9 -TTTAAT 


glgS 


glgS 


< 


HNS 


> 


yqiJ 


yqiJK 


T T T AAA- 1 5 -TATATT 


69.27 


TTAACA-18-TTTTAT 


yqjH 


yqjH 


< 




> 


yqjl* 


y<v> 


TTGCAA-1 6-TATAAA 


69.31 


TTGATC- 1 8 -TATAGT 


aer 


aer 


< 


HNS 


> 


patA 


ygjG 


TAAACA- 1 9 -TAAAAT 


69.68 


TTTTCA- 1 8 -TATCAT 


rlmG 


rlmG 


< 




> 


ygjP 


ygjP 


TTGCCC-1 8 -TATACC 


70.35 


TTGATA- 1 6-TGTAAT 


trcA- -G 


tdcA* 


< 


HNS 


> 


tdcR 


tdcR 


TTTAAA- 1 6 -TATAAA 


71.48 


TTGACAA B-TATAAT 


metY- -pnp 


metY 


< 




> 


argG 


argG 


ATGAAA- 1 7 -AAAAAT 


71.93 


T G G AC T - 1 6 - T AAAAC 


mlaFEDCB 


mlaF 


< 




> 


yrbG 


yrbG- -IptAB 


T T T AC T - 1 7 - C AAAAT 


72.51 


T T AC C A- 1 6 - C AT AAA 


insH-10 


insHIO 


< 




> 


yhcF 


yfcF 


CTCACA-18-TTTAAG 


72.99 


ATAACA- 1 8 -TAT ATT 


aaeXAB 


aaeX 


< 


HNS 


> 


aaeR* 


aaeR 


TTGATA- 1 9-TGTTAT 


73.51 


TTAAAA- 1 7 -TAT ATT 


envR 


envR* 


< 


HNS 


> 


acrE 


acrEF 


TTGAGT - 1 9 - AATAAT 


73.94 


CTGTCA- 1 8 -TAGAAT 


smf 


smf 


< 




> 


def 


def-fmt 


TTGCTA-1 9-GATAAG 


74.39 


TTAATA- 1 7 -TATGAT 


gspAB 


gspA 


< 


HNS 


> 


gspC 


gspD- -LMO 


TTGATT - 1 7 - TAC TAT 


76.72 


TTTACG-19 - CAT A AT 


glpEGR 


glpE 


< 




> 


glpD 


glpD 


TTG AAA- 1 9-TATAAC 


77.61 


GTGCCA-1 8-TGTAGT 


ftsYEX 


ftsY 


< 




> 


rsmD 


rsmD-yhhL 


TTCACA-1 8-TGTTAT 


78 27 


1 1 1ALA 1 / — Un 1 Inl 


yhiL 


yhiL 




HNS 








C 1 IjAAA- lb- Inl Ann 


78.33 


TTGCCC-17 -GATAAT 


yhiN 


yhiN 


< 


HNS 


> 


pitA 


pitA 


TTCACT-18- TATAAT 


78.58 


ATGACG- 1 7 -TAT AAA 


dinQ 


dinQ 


< 




> 


arsR* 


arsRBC 


ATGACG- 1 7 -TATAAA 


78.67 


TTGTCA-18-7>17"A47" 


insH-11 


insHII 


< 


HNS 


> 


sip 


sIp-dctR 


TTTACG-1 7 -TAAAAT 


78.93 


TTGAAA-1 6-TATAAG 


gadW 


gadW* 


< 


HNS 


> 


gadY 


gadY 


TTCGCA-1 8 -TATAAA 


80.49 


TTCACA-1 7-TCAAAT 


bax 


bax 


< 




> 


malS 


malS 


TTGATA- 1 8-TCA AAT 


80.81 


TTGTAA-19-TTTTAT 


yiaT 


yiaT 


< 


HNS 


> 


yiaU* 


yiaU 


TTCATA-1 8 -TAAAAT 
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Map 


Promoter sequence 


Right Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Left Operon 


Promoter sequence 


81.24 


TATACA- IS- TA TAA T 


yiblH 


yibl 


< 




> 


mtIA 


mltADR 


TCTACA- 1 9 -TACAAT 


81.71 


TAT ACA- 1 8 - 771 TAA T 


yibB 


htrL 


< 


HNS 


> 


hldD 


rfaD-waaCL 


TTAATA-1 7 -CATAAA 


82.01 


TCTACA-1 7-TTTATT 


waaQ- -waaK 


waaQ 


< 


HNS 


> 


waaA 


waaA-coaD 


CTGACA- 1 6 -TTTTAT 


82.28 


GTCACA-16-TAAAAG 


ligB 


ligB 


< 




> 


gmk 


gmk 


GTCACA- 1 6 -AATAAG 


82.60 


TTGTAA-1 9 - TATAAT 


yicJI 


yicJ 


< 


HNS 


> 


seIC 


seIC 


TTATCA-1 8-TATAAA 


83.00 


TGGACA-1 7-GATACT 


istR 


istR 


< 


HNS 


> 


tisB 


tisA 


TTGTCC-1 7 -TATACA 


84.59 


TTTATA- 1 8 -TATGAT 


asnC-mnmG 


asnC* 


< 




> 


asnA 


asnA 


TTGATT - 1 6 -TAAAAT 


85.06 


TTGGCC-1 8 - AATAAT 


yifB 


yifB 


< 


HNS 


> 


ilvL 


HvLXGM 


TTGGCC-1 8 -AATAAT 


85.39 


TTCACC-1 9-AATAGT 


rhIB 


rhIB 


< 




> 


trxA 


trxA 


TTTACG-1 6-AATAAA 


85.83 


TTAACA- 1 8-TGTAAT 


as/A 


as/A 


< 


HNS 


> 


glmZ 


glmZ 


TTGAGA-1 7-GATGAT 


85.95 


CTCACA-1 5-TGTAAT 


hemCDKY 


hemC 


< 




> 


cyaA 


cyaA 


CTGACA-1 8-TAGGAT 


87.95 


TCGACA-1 5-TACATT 


fdoG-fdhE 


fdoG 


< 




> 


fdhD 


fdhD 


TCGACA-1 5-TACATT 


88.43 


TTTGCA-1 9-TATCGT 


cpxRA 


cpxR* 


< 




> 


cpxP 


cpxP 


ATGACG-1 9-TTTAAA 


88.70 


TGGACA-1 7-TACAAT 


glpFKX 


glpF 


< 




> 


zapB 


zapB 


TGGACA-1 7-TACAAT 


88.86 


TTTGCA- 1 8 -TATGAT 


priA 


priA 


< 




> 


rpmE 


rpmE 


CAGACA- 1 7 - TATAGC 


88.93 


TTGAGC- 1 7 -TAAAAT 


met J 


metJ* 


< 




> 


metB 


metBL 


TTGAGC - 1 7 -TAAAAT 


89.48 


TTCATA- 1 7-GATACT 


argE 


argE 


< 




> 


argC 


argCBH 


TTGACAA 8-TATCAA 


91.12 


TTGGCT-16-TCAAAT 


pepE 


pepE 


< 




> 


rluF 


rluF 


ATAACA-1 7 -TATTTT 


91.17 


TTGACA-1 6-TTTATT 


lysC 


lysC 


< 




> 


pgi 


pgi 


ATCACA- 1 8 -TACAAT 


91.65 




p/sB 


pisB 


< 




> 


dgkA 


dgkA 


T T AAC G - 1 9 - C ATA AT 


91.76 


TTGATA-1 7-CATAAC 


zur 


zur* 


< 




> 


yjbL 


yjbLM 


TTGTCG-1 8 -AATAAT 


91.84 




qorA 


qorA 


< 




> 


dnaB 


dnaB 


TCGTCA- 1 7 - TAAAGT 


92.01 


ATGCCA- 1 5 -TTTAAT 


uvrA 


uvrA 


< 




> 


ssb 


ssb 


TTGACC-18-TGGAAT 


92.09 


CTAACA- 1 5 -TATAGT 


yjcB 


yjcB 


< 


HNS 


> 


yjcC 


yjcC 


TTTTCA-1 6-TATAAA 


94.13 


GTGAAA-18-TTTCAT 


yjeH 


yjeH 


< 




> 


groS 


groSL 


TTTTCA- 1 7 -CAGAAT 


95.19 


ATCACA- 1 8 -TATCAT 


ulaG 


ulaG 


< 


HNS 


> 


ulaA 


ulaA-EF 


T T AAC T - 1 5 - GATAAT 


95.32 


TTGATT-15-GATCAT 


yjfY 


yjfY 


< 




> 


rpsF 


rpsF- -rpll 


TTCAAA- 1 7 -TGTGAT 


95.37 


TTGATT-15-GATCAT 


yjfZ 


yjfZ 


< 


HNS 


> 


ytfA* 


ytfA 


TTCACA-1 6-AATAAA 


95.51 


TCGACA-1 5-TACATT 


qorB 


qorB 


< 




> 


ytfH* 


ytfH 




96.37 


7TG/101-17-TGTAAT 


bdcA 


bdcA 


< 




> 


bdcR* 


bdcR 


TTGATT - 1 7 - TAG AAA 


96.46 


TTGCAA- 1 5 -TAT AAA 


argl 


argl 


< 




> 


rraB 


rraB 


T TAAAA- 1 6 - GAT TAT 


96.50 


TTGATA-1 9-TAAAAT 


yjgM 


yjgM 


< 


HNS 


> 


yjgN 


yjgN 


TTGCCA-1 8-TATTGT 


96.99 


TTAACA- 17-GATAAA 


insG 


insG 


< 




> 


yjhB 


yjhBC 


AAGACA- 1 7 -TATTGT 


97.13 


TTAACG- 1 7 -TAGAAT 


InsM 


insM 


< 




> 


yjhV 


yjhV 


CTGTCA- 1 6 -TATAAA 


97.33 


TTCTCA- 1 7 -GATAAT 


far-ID 

recin 


fori** 

reel 








insA7 


insA-7 


■ 1 A A /" A 1"7TATAA/" 

1 1 MALA- 1 /- 1 A 1 AAu 


97.78 


TCAACA- 1 7 -TTTAAT 


nanCM 


nanC 


< 


HNS 


> 


fimB 


fimB 


TTGGCA-1 6 -TATATT 


98.46 


TTC ACA- 1 9-TTTTAT 


yjiR 


yjiR* 


< 




> 


yjiS 


yjiS 


TTAACC - 1 5 - TAAAAG 


99.15 


ATGAAA- 1 7 -TTTAAT 


yjj° 


yjjP 


< 


HNS 


> 


yjjQ* 


yjjQ-bglJ 


CTGATA- 1 7 -GATAAT 


99.21 


TTGATA-1 9-GATAAT 


fhuF 


fhuF 


< 




> 


yjjz 


yjjz 


T TGCAA- 1 6 -TATGAT 


99.96 




arcA 


arc A* 


< 




> 


yjjY 


yjjY 


TTGCCA-1 9-TACAAA 






300 genes (a) 


178 |b) 


K 


HNS 




178 (a) 


291 genes (b) 





1.68 (a/b) 26 TFs 64 19 TFs 1.63 (a/b) 
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Table 1. Cont. 





Map 


Promoter sequence 


Right Operon Left Gene D 


RpoD D 


Right Gene Left Operon Promoter sequence 






63 Y-genes 


36% 


74 Y-genes 






20 essential 




19 essential 



A total of 1,075 RpoD holoenzyme-binding sites were identified within spacers on the entire E. coli K-12 W3110 genome. The constitutive promoters were predicted 
within type-A and type-B intergenic spacers {see Fig. 1A for classficiation). A total of 178 RNA polymerase RpoD holoenzyme-binding sites were identified within type-A 
spacers, which direct bidirectional transcription. Based on the gene orientation around these promoters, the genes and operons under the control of these promoters 
were estimated, that are located on either left side (left gene column) or right side (right gene column} of the respective spacers. Genes encoding transcription factors 
are indicated by star symbols (*) and the operons are shown in the operon columns [note that only the first and the last genes are shown for polycitronic operons]. The 
directions of transcription for these flanking genes are shown by arrows in column D. The map positions of left-side and right-side genes are shown in the map columns. 
The essential genes listed in the PEC database are underlined. The promoter sequences were predicted according to the analysis procedure described in Materials and 
Methods. For some spacers, multiple promoters were identified, of which the best-match promoters with the highest scores are described. The promoter sequence with 
complete match with the canonical promoter (see Fig. 4) is shown in bold and italic while the promoter sequence with 5-out of-6 match is shown in bold. The spacer 
including H-NS binding sites are shown as HNS mark in the spacer column. The numbers of hitherto identified promoters are 121 and 133 for left-ward and right-ward 
transcription, respectively, which correspond to 68 and 75%. Total number of genes under the control of 1 78 promoters were 300 for left-ward transcription, and 291 for 
right-ward transcription. The average numbers of genes under one promoter are 1 .68 and 1 .63 for left -ward and right-ward transcription, respectively. Among the total 
of 178 RpoD holoenyme-binding sites, 64 (36%) overlap with the H-NS-binding sites. 
doi:1 0.1 371 /journal.pone.0090447.t001 



and then a mixture of substrates and heparin was added to allow 
the single-round transcription. The final level of transcripts 
represents the amount of RNA synthesized in 15 min reaction 
after the addition of substrate mixture into open complexes formed 
during preincubation for various times up to 30 min (referred to 
parameter-I in this study). Parameter-I represents the binding 
affinity of RpoD holoenzyme to the test promoter. On the other 
hand, the slope of transcript increase represents the rate of open 
complex formation (referred to parameter-II). 

Both parameter-I and -II were determined for each variant 
promoter for three times and the average values are shown in 
Fig. 4. In each panel, the promoter activity is compared between 
four templates with different bases at the same position. Among 
the collection of -35 variants, the best promoter giving the highest 
activity of open complex formation (parameter-I) was identified for 
the consensus TTGACA sequence (Fig. 4A). This indicates that 
the consensus sequence of promoter -35 influences the binding 
affinity of RNA polymerase to the promoter in agreement with the 
previous estimation [23]. As to the promoter -10 signal, the best 
sequence giving the highest rate of open complex formation 
(parameter-II) was identified for the consensus TATAAT (Fig. 4B), 
indicating that promoter -10 influences the rate of promoter 
opening. 

Using this experimentally confirmed consensus sequence, 
TTGACA- 1 7 bp-TATAAT, of the constitutive promoter recog- 
nized by RpoD holoenzyme alone, we search for the location of 
constitutive promoters within both type-A and type-B spacers 
including RpoD holoenzyme-binding sites. 

Unique features of the RpoD constitutive promoters 

After sequence analysis of the entire genome of E. coli K-12, we 
realized that there is no ideal sequence of RpoD promoter with 
perfect matching to this consensus sequence. We then analyzed 
whether the constitutive RpoD promoters harbor unique sequenc- 
es. By setting a rather severe screening condition of the sequence 
matching of more than 4 out of 6 bases for both -35 and -10 signals 
(total score, higher than 8; the highest score, 12) and with a spacer 
length of 17 plus/minus 2 (score 3 for 17 bp spacer, score 2 for 16 
and 1 8 bp spacers, and score 1 for 1 5 and 1 9 bp spacers), a total of 
as many as 316 promoter sequences (89%) were identified among 
354 predicted constitutive promoters within type-A spacer, and a 
total of 226 promoter sequences (82%) were identified among 276 
predicted promoters excluding the internal promoters within type- 



B spacer. Overall the total amount of constitutive promoters with 
high-level (higher than 4/6) matching with the consensus RpoD 
promoter for both -36 and -10 signals is more than 85% (Fig. 5A- 
1). This is in sharp contrast with the collection of 582 
experimentally defined promoters, in which the amount of 
promoters containing a high-level (4/6) agreement to both -35 
or -10 signals are less than 40% (Fig. 5B-1). On average, only less 
than half of the 12 canonical bases of the -35 and -10 boxes are 
conserved among the experimentally identified promoters [39]. 
The length of spacers between -35 and -10 signals ranges from 
minimum 14 to maximum 21 bp. Only 10-20% efficient 
promoters have been proposed to include either -35 or -10 box 
that resembles the consensus with five-out-of-six bases (5/6 
agreement). 

Using a total of 669 sequences of the constitutive promoter with 
high-level of sequence conservation, the consensus sequence of 
RpoD promoter was reevaluated. The Logo plot, shown in 
Fig. 5A-2, indicates that: 1) the patterns of conserved bases of both 
-35 and -10 sequences are similar for both left and right genes with 
both type-A and type-B spacers (data not shown); and 2) a 
significant difference exists in the consensus sequence of both -35 
and -10 signals between a total of 543 newly identified constitutive 
promoters and a total of 582 experimentally identified promoters. 
The Logo pattern generated from the alignment of a total of 582 
experimentally identified E. coli promoters, shown in Fig. 5B-2, 
indicates that the level of sequence conservation is significandy 
higher for -10 than -35, and the conserved bases are limited, TT 
(positions 1 and 2) of -35 signal and TA (positions 1 and 2) and T 
(position 6) of - 1 0 signal. This Logo pattern agrees well with that 
analyzed by Mitchell et al. [40] using the promoter set of 
RegulonDB [41]. In the case of constitutive promoters, -35 and 
-10 sequences are equally well-conserved (Fig. 5A-2). Within -35 
sequence, TT (position 1 and 2) and AA (positions 4 and 6) are 
well-conserved but G at position 3 and C at position 5 appear less 
important even for the constitutive promoters. On the other hand, 
all six bases of TATAAT within -10 signal are equally well- 
conserved, indicating their equal contribution to the promoter 
activity. The marked difference of the consensus sequence of 
experimentally identified RpoD promoters from that of constitu- 
tive promoters indicates that the promoter databases include a 
number of inducible promoters that were active under the support 
of positive TFs. 
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Constitutive Promoters of RpoD Sigma Factor 



Table 2. Constitutive Promoters (Type-B Spacers) (Leftward transcription). 



Map 


Promoter sequence 


Left Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Right Operon 




0.44 


CTGCCA- 1 6 -TATGAT 


insAB-1 


ins A- 7 


< 




< 


rpsT 


rpsT 




0.85 




caiABCDE 


caiA 


< 




< 


caiT 


caiTABCDE 




3.96 




yaeH 


yaeH 


< 




< 


yael 


yael 




5.76 


T TGAAA- 1 5 -TATCAT 


ykfA 


ykfA 


< 




< 


perR 


perR 




5.99 


Internal Promoter 


(insAB-afuBC) 


afuB 


< 




< 


ykgN 


(yagAB-ykgN) 




6.02 


Internal Promoter 


(yagAB-ykgN) 


ykgN 


< 




< 


yagB 


(yagAB-ykgN} 




6.28 


T TTAGA- 1 6 - TA TAA T 


yagK 


yagK 


< 


HNS 


< 


yagL 


yagL 




6.34 


T TAAAA- 1 7 -TATCAT 


yagN 


yagN 


< 


HNS 


< 


intF 


intF 




6.72 


TTGAAA-1 7-TATCTT 


ykgMO 


ykgM 


< 


HNS 


< 


ykgR 


ykgR 




6,84 




ykgIB 


ykgl 


< 


HNS 


< 


ykgC 


ykgC 




6.98 


GCGACA- 1 6 -TATATT 


ykgH 


ykgH 


< 


HNS 


< 


betA 


betA 




7.77 


Internal Promoter 


(lacZYA) 


lacA 


< 




< 


lacY 


(lacZYA) 




8.14 


Internal Promoter 


(frmRAB) 


frmA 


< 




< 


frmR 


frmRAB 




8.17 


TTGACA-15-TATAGT 


frmRAB 


frmR* 


< 




< 


yaiO 


yaiO 




9.47 


Internal Promoter 


(xseB-ispA -dxs) 


ispA 


< 




< 


xseB 


(xseB-ispA -dxs) 




9.63 


Internal Promoter 


(cyoABCDE) 


cyoD 


< 




< 


cyoC 


(cyoABCDE) 




10.27 


TCCACA- 1 7 -TACACT 


ylaB 


ylaB 


< 




< 


ylaC 


ylaC 




10.30 




ylaC 


ylaC 


< 




< 


maa 


maa 




10.33 




hha 


hha 


< 




< 


tomB 


tomB-hha 




10.34 


ATGAAA- 1 7 -TATAGT 


tomB-hha 


tomB 


< 


HNS 


< 


acrB 


(acrAB) 




14.19 


ATGGCA-17-TACATT 


lipA 


lipA 


< 




< 


ybeF 


ybeF 




14.22 


TTTACA-15 -TATATT 


ybeF 


ybeF* 


< 


HNS 


< 


lipB 


lipB 




15.30 


TTGTAA- 1 8 -TACAAT 


uof-fur 


uof 


< 




< 


fldA 


fldA 




15.31 


TGGGCA- 1 8 -AATAAG 


fldA 


fldA 


< 




< 


ybfE 


ybfE 




15.32 


TTGGCG-18-TATTAT 


ybfE 


ybfE 


< 




< 


ybfF 


ybfF 




16.20 


TAAAC A- 1 6 -TAAAAT 


ybgD 


ybgD 


< 


HNS 


< 


gltA 


gltA 




16.94 


TTCAAA- 1 7 -CATATT 


gpmA 


gpmA 


< 




< 


galM 


(galETKM) 




17.03 


ATGAAA- 1 7 -TAAAAA 


galETKM 


galE 


< 


HNS 


< 


modF 


modEF 




19.38 


TTGTCC-17-TAAATT 


artl 


artJ 


< 




< 


artM 


artM 




19.45 


T T AAC T - 1 8 - C ATA AT 


artPIQM 


artP 


< 


HNS 


< 


ybjP 


ybjP 




19.68 


TTGACG-1 9-TGTAAT 


ybjE 


ybjE 


< 




< 


aqpZ 


aqpZ 




20.54 


TTTACA-1 7-AATAAT 


focA-pfIB 


focA 


< 




< 


ycaO 


ycaO 




22.15 


TTGAAA-1 6-TATATC 


hspQ 


hspQ 


< 




< 


rlml 


rlml 




24.14 


ATGACA- 1 7 -TATAAA 


bssS 


bssS 


< 


HNS 


< 


dinl 


dinl 




25.21 


TTGCTA-16-GATAAT 


mfd 


mfd 


< 




< 


ycfT 


ycfT 




25.49 


Internal Promoter 


(potABCD) 


potB 


< 




< 


potA 


(potABCD) 




26.11 


TTCATA- 1 7 -C ATA AT 


iraM 


iraM 


< 


HNS 


< 


ycgX 


ycgX 




26.13 


TTTAAA- 1 6 -AATAAT 


bluR 


bluR* 


< 


HNS 


< 


bluF 


bluF 




27.07 


Internal Promoter 


iychF) 


ychF 


< 




< 


pth 


pth 




28.17 


TTAACA-1 7-TATGTT 


kch 


kch 


< 




< 


ycil 


ycil 




28.91 


TTAACA-17 -TATTAT 


osmB 


osmB 


< 


HNS 


< 


yciT 


yciT 




28.92 


TTGAGG-1 6-TATTTT 


deoT 


deoT 


< 




< 


yciZ 


yciZ 




29.06 


GTGAAA- 1 7 -GAGAAT 


fabl 


fabl 


< 




< 


ycjD 


ycjD 




29.08 


CTGACA- 1 6 -CAGAAT 


ycjD 


ycjD 


< 




< 


sapF 


sapDF 




29.17 


ATGACA- 15 -TTTAAT 


sapABCDF 


sapA 


< 




< 


ymjA 


ymjA 




30.85 


TGTAC A- 1 6 - AATAAT 


pinR 


pinR 


< 




< 


ynaE 


ynaE 
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Table 2. Cont. 





Map 


Promoter sequence 


Left Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Right Operon 


30.86 


CTGACA- 1 7 -TACCAT 


ynaE 


ynaE* 


< 


HNS 


< 


uspF 


uspF 


31.54 


CTGACA-1 7-AATAAC 


ynbG 


ynbG 


< 




< 


insC-2 


insCD-2 


33.21 


TTCACC-16-TCTAAT 


narU 


narU 


< 




< 


yddJ 


yddJ 


33.25 


Internal Promoter 


(yddLKJ) 


yddK 


< 




< 


yddL 


yddLKJ 


33.42 


Internal Promoter 


yddM 


yddM 


< 




< 


adhP 


adhP 


33.58 


Internal Promoter 


(ddpXABCDFI 


ddpB 


< 




< 


ddpA 


(ddpXABCDF) 


34.03 


TTGTCA-1 6-TATTAA 


ydeN 


ydeN 


< 




< 


ydeO 


ydeO 


34.10 


T T GAAG -16 -TATATT 


ydeP 


ydeP 


< 


HNS 


< 


ydeQ 


ydeQ 


34.27 


TTGACT-1 6-TAAAAC 


hipBA 


hipB 


< 




< 


yneO 


yneO 


34.28 


CTGACA- 1 7 -TTTAAT 


yneO 


yneO 


< 


HNS 


< 


IsrK 


IsrRK 


34.68 


TTGCCG-1 9-TATCTT 


yneF 


yneF 


< 




< 


yneG 


yneG 


34.71 


TTTTCA-17-TAGAAA 


yneHG 


yneH 


< 




< 


sad 


sad 


35.31 


TTTATA-16 - AATAAT 


essQ- -rrrQ-ydfP 


essQ 


< 


HNS 


< 


cspB 


cspB 


35.58 


Internal Promoter 


(rspAB) 


rspB 


< 




< 


rspA 


rspAB 


35.60 


TTGTCA- 1 7 -TATACG 


rspAB 


rspA 


< 




< 


ynfA 


ynfA 


36.28 


TTAACG- 1 7 -AAAAAT 


fumC 


fumC 


< 




< 


fumA 


fumA 


36.52 


TT AAC C - 1 7 - T AT AC G 


uidR 


uidR* 


< 




< 


hdhA 


hdhA 


37.12 


TTCAAA-15-TACACT 


sodC 


sodC 


< 




< 


ydhF 


ydhF 


37.98 


ATCACA- 1 6 -GATAAT 


sufABCDSE 


sufA 


< 




< 


rydB 


rydB 


38.00 


TTGTCA-1 6-CATATT 


yd/H 


ydiH 


< 




< 


ydil 


ydiJI 


38.76 




infC-rpml-rpIT 


infC 


< 




< 


thrS 


thrS- -rplT-pheST 


39.02 


AT GAC T - 1 6 - AATAAT 


ydjO 


ydjO 


< 


HNS 


< 


cedA 


cedA 


39.28 




ves 


ves 


< 




< 


spy 


spy 


39.92 


TTGAAA- 1 7 -GATAAT 


ydjF 


ydjF* 


< 




< 


ydjG 


ydjG 


40.46 


TTGCCC-1 9-TTTTAT 


yeaQ 


yeaQ 


< 




< 


yoaG 


yoaG 


40.65 


T TAATA- 1 8 -TATCAT 


fadD 


fadD 


< 




< 


yeaY 


yeaY 


41.07 


TTGCCA- 1 7 -GATAAC 


yobF-cspC 


yobF 


< 


HNS 


< 


yebO 


yebO 


41.18 




pre 


pre 


< 




< 


proQ 


proQ 


41.53 


TTGTCC- 1 5 -CACAAT 


yebE 


yebE 


< 




< 


yebF 


yebF 


41.62 


TTCACC-17-TACACT 


edd-eda 


edd 


< 




< 


zwf 


zwf 


41.89 


TGGATA-17 -TATCAT 


ruvAB 


ruvA 


< 


HNS 


< 


yobl 


yobl 


42.14 


GTGACA- 1 8-TAA AAA 


torYZ 


torY 


< 


HNS 


< 


cutC 


cutC 


42.90 




pgsA 


pgsA 


< 




< 


uvrC 


uvrC 


42.98 


TTGCAA- 1 7 -AATAAT 


sdIA 


sdiA* 


< 


HNS 


< 


yecC 


yecC 


43.07 


Internal Promoter 


(fliAZY) 


fliZ 


< 




< 


fliA 


fliAZY 


44.47 


TTG AGA- 1 6-TATATT 


cobUST 


cobU 


< 




< 


insH-6 


insH-6 


44.49 


TGTACA- 1 7 -CATGAT 


yoeG 


yoeG 


< 




< 


yoeH 


yoeH 


44.55 


TTTTCA-1 8 - TA TAA T 


yoeH 


yoeH 


< 


HNS 


< 


insD-3 


insD-3 


44.77 


TTGTCA-1 5-AGTAAT 


yeeX 


yeeX 


< 




< 


yeeA 


yeeA 


44.95 




yeeY 


yeeY 


< 




< 


yeeZ 


yeeZ 


45.26 


ATTACA- 1 9 -TATCAT 


insH-7 


insH-7 


< 


HNS 


< 


wbbK 


(wbblJK) 


45.37 




glf-wbbH 


glf 


< 




< 


rfbX 


(rfbBDACX) 


45.48 


TTCACA-1 8-TGGAAT 


rfbBDACX 


rfbB 


< 




< 


galF 


galF 


46.67 




ogrK 


ogrK 


< 




< 


yegZ 


yegZ 


46.92 


Internal Promoter 


(gatZABCD) 


gatB 


< 




< 


gatA 


(gatZABCD) 


47.00 




yegX 


yegX 


< 




< 


thiD 


thiMD 


47.63 




yehS 


yehS 


< 




< 


yehT 


yehT 
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Map 


Promoter sequence 


Left Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Right Operon 


47.89 


TTGACG-1 8-TATG AT 


pbpG 


pbpG 


< 




< 


yohC 


yohC 


49.05 


TTGATG- 1 7 -TATCAT 


yejG 


yejG 


< 




< 


bcr 


bcr 


49.64 


TTGATG-1 6-TGCAAT 


mqo 


mqo 


< 


HNS 


< 


yojl 


yojl 


51.11 


GTGAAA-1 6-GATAAT 


pmrD 


pmrD 


< 




< 


menE 


menE 


51.61 


Internal Promoter 


(nuoABC- -HIJKLj 


nuoH 


< 




< 


nuoG 


(nuoAB- -GHIJKLj 


51.73 


Internal Promoter 


(nuoABC- -HIJKLj 


nuoC 


< 




< 


nuoB 


(nuoAB- -GHIJKLj 


51.78 


TTGACAA 8-TAAAAA 


nuoAB- -IJKLMN 


nuoA 


< 




< 


IrhA 


IrhA 


52.25 


TTG AAA- 1 6-TTTAAT 


hisJQMP 


hisJ 


< 




< 


argT 


argT 


52.87 


TTGAAA-1 7-TATAGT 


yfcV 


yfcV 


< 


HNS 


< 


sixA 


sixA 


53.61 


CTGACA-1 9-CATTAT 


yfdV 


yfdV 


< 




< 


oxc 


oxc 


53.67 


TTTATA- 1 8 -AATAAT 


frc 


frc 


< 


HNS 


< 


yfdX 


yldX 


57.16 




sseB 


sseB 


< 




< 


pepB 


pepB 


57.96 


T TAAC T - 1 7 -TCTAAT 


glmY 


glmY 


< 




< 


purL 


purL 


58.68 


ATGATA- 1 5 -AATATT 


kgtP 


kgtP 


< 




< 


rrfG 


(rrsG- -rrlG-rrfG) 


59.52 


T AG AC G - 1 8 - T G GAAT 


yfjLK 


yfjL 


< 




< 


yfJM 


yfjM 


59.97 


TTATCA-18 -TTTAAT 


ypjc 


ypjc 


< 


HNS 


< 


ileY 


ileY 


61.31 


Internal Promoter 


(hycABCDEFGHIj 


hycD 


< 




< 


hycC 


(hycABCDEFGHIj 


62.01 


Internal Promoter 


(cusABC -ygbTF) 


ygbT 


< 




< 


casE 


(cusABC- -ygbTF) 


62.70 


TTGATA-1 5 TATGAT 


mazEFG 


chpR 


< 




< 


relA 


relA 


64.26 


TTGAAA-1 7-TATC AT 


kdul 


kdul 


< 


HNS 


< 


yqeF 


yqeF 


64.40 




yqeL 


yqeL 


< 




< 


yqeK 


yqeK 


65.59 


GTGACG-15-TTCAAT 


ygfF 


ygfF 


< 




< 


gcvP 


gcvP 


66.18 


TTCCCA-1 6-TGTGAT 


epd-pgk-fbaA 


epd 


< 


HNS 


< 


yggC 


(yggDC) 


67.03 


GTGACG-15-TTCAAT 


yghF 


yghF 


< 




< 


yghG 


yghG 


67.77 


TTGCCT-17-GACAAT 


yghW 


yghW 


< 




< 


yghX 


yghX 


68.24 


TTAACC- 1 5 -TAAAGT 


mqsRA 


mqsR 


< 




< 


ygiV 


ygiV 


68.40 


Internal Promoter 


(nudF- -yqiA-parE) 


yqiA 


< 




< 


cpdA 


(nudF- -yqiA-parE) 


68.81 


TTGACG-1 7-TAAAGT 


sibD 


sibD 


< 




< 


sibE 


sibE 


71.46 




rimP-nusA -infB 


rimP 


< 




< 


metY 


metY 


71.60 




folP-glmM 


folP 


< 




< 


ftsH 


ftsH 


72.18 


TTGAGG- 1 8 -CACAAT 


arcB 


arcB 


< 




< 


yhcC 


yhcC 


72.74 


TGGCCA-1 8-TAAAAA 


sspAB 


sspA 


< 




< 


rpsl 


rpsl 


74.68 


TTGAAA-1 7-TATTTT 


bfd-bfr 


bfd 


< 




< 


chiA 


chiA 


74.87 


TGGAAA- 1 6 -ATTAAT 


yheO-tusDCB 


yheO* 


< 




< 


fkpA 


fkpA 


75.16 


TTGCCA-1 7-CATATT 


argD 


argD 


< 




< 


pabA 


pabA 


76.67 




glpR 


glpR* 


< 




< 


glpG 


glpG 


77.07 


TTAGCA- 1 7 -TTTAGT 


gntR 


gntR* 


< 




< 


yhhW 


yhhW 


77.55 


TTCACA-1 9-GATAAA 


rpoH 


rpoH 


< 




< 


ftsX 


ftsX 


78.19 


TAGACA-1 6-TACTAT 


yhil-rbbA-yhhJ 


yhil 


< 




< 


yhU 


yhU 


78.22 


TTGACG-1 9- TATAAT 


yhU 


yhU 


< 


HNS 


< 


yh/'L 


yhiL 


78.76 


TGAACA- 1 7 -TAAAAG 


(hdeABD) 


hdeB 


< 


HNS 


< 


hdeA 


hdeABD 


78.95 


TTAATA-1 6 -TGTAAT 


gadX 


gadX* 


< 


HNS 


< 


gadA 


gadA 


78.98 


TTAATA-17 -TATATT 


gadAX 


gadA 


< 


HNS 


< 


yhjA 


yhjA 


79.79 


Internal Promoter 


(dppABCDF) 


dppC 


< 




< 


dppB 


(dppABCDF) 


80.87 


C AG AC A- 1 7 -TATAAA 


yiaWV 


yiaW 


< 


HNS 


< 


aldB 


aldB 


83.00 


CTGAAA- 1 9 -TGTAAA 


ivbL- -uhpABC 


ivbL 


< 




< 


istR 


istR 
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Table 2. Cont. 





Map 


Promoter sequence 


Left Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Right Operon 


83.53 




gyrB 


gyrB 


< 




< 


recF 


recF 


84.18 




pstB-phoU 


pstB 


< 




< 


pstA 


(pstCA) 


85.30 


CTG AC A- 1 8-G ATCAT 


ppiC 


ppiC 


< 




< 


yifO 


yifO 


86.22 




yigF 


yigF 


< 




< 


yigG 


yigG 


87.36 




glnLG 


glnL 


< 




< 


glnA 


glnA 


88.73 


TTGCCA-1 8 TATACT 


rraA 


rraA 


< 




< 


menA 


menA-rraA 


92.24 


TTCCCA-15-TAAACT 


yjcF 


yjcF 


< 


HNS 


< 


actP 


actP 


92.56 




yjcO 


yjcO 


< 




< 


fdhF 


fdhF 


93.18 


T TAT CA- 1 6- TA TAAA 


yjdN 


yjdN 


< 




< 


yjdM 


yjdM 


93.46 


TTTACA-17-GATACT 


adiA 


adiA 


< 


HNS 


< 


meIR 


meIR 


93.90 


T T G A G T - 1 9 - TA TAA T 


cadBA 


cadB 


< 


HNS 


< 


cadC 


cadC 


94.05 


ATAACA- 1 7 -TAAAAA 


dcuA 


dcuA 


< 




< 


aspA 


aspA-dcuA 


94.22 


TTAACC-1 7 -TAGAGT 


yjeJ 


yjeJ 


< 




< 


epmB 


epmB 


95.44 


TTTTCA- 1 6-AAAAAT 


nrdD 


nrdD 


< 




< 


treC 


treC 


97.36 


T TAAGA- 1 5 -TTTAAT 


yjhU 


yjhU* 


< 




< 


yjhF 


yjhF 


97.49 


TTCAAA- 1 7 -TTTAAT 


yjhIHG 


yjhl* 


< 




< 


sgcR 


sgcR 


97.61 


TTTACC-17-TATCAC 


sgcXBCQAER 


sgcX 


< 




< 


yjhP 


yjhQP 


98.20 




iadA 


iadA 


< 




< 


yjiG 


(yjiHG-iadA) 


98.30 


CTGACC-19-TACAAT 


yjiK 


yjiK 


< 




< 


yjiL 


yjiL 


98.63 


TTTACC-1 6-AAAAAT 


mcrBC 


mcrB 


< 




< 


symE 


symE 


99.60 




IplA 


IplA 


< 




< 


ytjB 


ytjB 






290 (a) 


181 (b) 




HNS 












1.80 (a/b) 


15 TFs 




39 












69 Y-genes 






24% 








14 essential 



Among the total of 1,075 RpoD holoenzyme-binding sites, 181 are located within type-B spacers upstream of left-side genes and downstream of right-side genes, 
indicating that these promoters direct leftward transcription. The genes and operons under the control of these 1 81 promoters were estimated, of which 16 represent 
putative internal promoters. Descriptions and symbols are as in Table 1 . A total of 1 5 genes encoding transcription factors are indicated by star symbols (*). The essential 
genes listed in the PEC database are underlined within the operons. The promoter sequence with complete match with the canonical promoter (see Fig. 4) is shown in 
bold and italic while the promoter sequence with 5-out of-6 match is shown in bold. The spacers including H-NS binding sites are marked as HNS in the spacer column. 
Total number of genes under the control of these 181 promoters were 290 (1 .80 gene per promoter}. Among the total of 1 81 RpoD holoenyme-binding sites, 39 (24%) 
overlap with the H-NS-binding sites. 
doi:1 0.1 371 /journal.pone.0090447.t002 



Constitutive promoters carrying the binding site of 
silencer H-NS 

The constitutive promoters direct transcription by the RpoD 
holoezyme alone in the absence of positive transcription factors. 
Expression in vivo of the constitutive promoters must be repressed 
under conditions where the target genes are not needed. One 
possible transcription factor with the silencing activity is H-NS, 
one of the major core nucleoid proteins with functional dichotomy 
[42] . H-NS is known as a universal silencer for repression of a 
number of xenogeneic genes [43,44]. In parallel with the mapping 
of RpoD promoters, we identified a total of 987 H-NS-binding on 
the E. coli genome using the Genomic SELEX system [45] . Within 
a total of 492 spacers (177 type-A and 315 type-B) with RpoD 
holoenzyme-binding sites, H-NS binding sites were identified 
within 63 type-A (36%) (indicated as H-NS in column P of Table 1) 
and 77 type-B spacers (24%) (Tables 2 and 3). 

The genome-silencing function of H-NS is attributable to its 
unique mode of DNA binding. H-NS recognizes and binds to 
intrinsically curved (or bent) DNA with AT-rich sequences, and 
such curved sequences are often located near promoters and also 



within transported DNA segments such as phage genomes [46,47]. 
Promoter-associated DNA curvature provides H-NS with the 
initial contact site, leading to form DNA-H-NS filaments via 
cooperative protein-protein interaction [48]. Since once H-NS 
binds within one spacer, it could be fully covered with H-NS by 
spreading from the initial binding sites through high level of 
protein-protein cooperativity. Thus, if H-NS binds within one 
type-A spacer, it influences the silencing effect to promoters for 
both of the bidirectional transcription. Taken together we 
estimated that a total of 203 promoters (63x2 + 77) or 31% 
among a total of 669 constitutive promoters could be under the 
control of genome silencer H-NS, indicating that one third of the 
constitutive promoters are subject to silencing by H-NS. 

Among the total of 203 possible targets of H-NS silencing, a 
total of 20 promoters are known to be under the control of H-NS. 
Up to the present time, however, the involvement of H-NS in 
promoter regulation has not analyzed seriously because this 
bifunctional nucleoid protein is abundant, playing both architec- 
tural and regularatory roles. The involvement of H-NS in 
transcription regulation of a specific gene has only been identified 
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during the systematic search for transcription factors. For instance, 
the csgD (the master regulator of biofifm formation), the ndh gene 
(NADH:ubiqinone oxidoreductase), and the gadX gene (the stress- 
response regulator for acid resistance) are all under the control of 
more than 10 transcription factors including H-NS [4,22]. By 
using the newly developed PS-TF (promoter-specific transcription 
factor) screening system, we have identified the involvement of FI- 
NS for regulation of a set of genes that are all under the control of 
multiple transcription factors [49]. These findings altogether 
suggest that the involvement of H-NS in transcription silencing 
of a number of constitutive promoters will be established once PS- 
TF screening is performed. 

Discussion 

Definition of the constitutive promoters 

The pattern of genome transcription in E. coli is determined by 
controlling the utilization of a limited number of RNA polymerase 
[4,22]. High-throughput systems have been developed for 
identification of the RNA polymerase distribution within the 
genome. For instance, large-scale mapping of the functioning 
promoters in vivo on the E. coli genome has been performed by the 
chromatin immuno-precipitation and microarray (ChlP-chip) 
analysis with specific antibodies against RNA polymerase subunits 
[14,15]. To prevent moving RNA polymerase along DNA, E. coli 
cells were treated with rifampicin, which binds to the RpoB 
subunit of RNA polymerase and blocks transcription initiation, 
thereby fixing the initiated RNA polymerase on promoters [50]. 
For identification of the distribution pattern of functioning RNA 
polymerase, RNA-Seq analysis of high throughput sequencing of 
transcriptomes is becoming the method of choice [16-18]. 
Concomitant with the increase in the number of promoters 
detected under various stressful culture conditions, the variation of 
promoter sequence is expanding. The fluctuation of functioning 
promoters is attributable to the modulation of promoter selectivity 
of RNA polymerase by sigma factors and transcription factors with 
positive and negative regulatory functions [3,4] . 

The constitutive promoter has been considered to represent a 
set of promoters associated with the genes constitutively expressed 
in vivo in all circumstances. Based on this definition, however, it is 
practically impossible to identify the whole set of constitutive 
promoters of E. coli under various environmental conditions. At 
present, the whole sets of promoters determined by high-through 
put procedures (see above) are accumulating, but the experiments 
have been performed using different E. coli strains, under different 
culture conditions and using different experimental systems. In 
addition, even in laboratory culture conditions, it now turned clear 
that the steady state of cell growth does not exist and moreover, 
the genome expression pattern varies between individual cells 
within the same culture. In contrast, the whole set of constitutive 
promoters can be identified in vitro because in the case of is. coli, the 
faithful transcription can be established in vitro using purified RNA 
polymerase and pure DNA template under defined conditions. To 
avoid the complexity arisen from in vivo determination of the 
functioning promoters, an attempt was then made in this study to 
identify the whole set of the constitutive promoters using an in vitro 
system. Based on the results, we propose to revise the definition of 
the constitutive promoter as the promoter that is recognized and 
transcribed by the RNA polymerase holoenzyme alone in the 
absence of supporting TFs. 

SELEX-chip search for the constitutive promoters 

For this purpose, we used the improved method of Genetic 
SELEX screening [21], which has been successfully employed for 



identification of binding sites of a number of TFs on the E. coli 
genome [22]. As a result, the number of regulation targets 
markedly increased even for the TFs with known regulatory 
functions. For instance, the number of regulation targets increased 
more than 2.5 fold from 150 to 350 even for the best characterized 
transcription factor, CRP (cAMP receptor protein) or CAP 
(catabolite activator protein) [51]. This experimental system is 
particularly useful for short-cut estimation of the regulation targets 
of uncharacterized TFs including YbjK (renamed to RcdA), YcdC 
(renamed to RutR), YcjZ (renamed to PgrR), YdhM (renamed to 
NemR) and YgiP (renamed to Dan) [22]. The Genomic SELEX 
screening system has also been successfully employed for detection 
of the alteration of promoter-recognition properties of transcrip- 
tion factors after phosphorylation (in the case of two-component 
systems) or interaction with effector ligands. For instance, the 
selection of regulation targets of SdiA, a regulator of genes for cell 
division and differentiation, was found to alter differently in the 
presence of each of three homoserine lactone analogs, the QS 
signals [52]. 

As an extension of screening of the whole set of RpoD 
promoters, we have successfully performed the screening of 
promoters recognized by RNA holoenzymes containing minor 
sigma factors, RpoS, RpoH, RpoF, RpoE and Feci. In the case of 
RpoN sigma factor, it requires enhancer-binding proteins such as 
NtrC and for formation of stable RNA polymerase-promoter 
complexes. Results will be described elsewhere. 

Physiological roles of the constitutive promoters 

The genes under the control of constitutive promoters are 
supposed to be expressed constitutively although unnecessary 
genes are subject to repression by silencers (see the H-NS chapter). 
One short-cut interpretation is that the essential genes are under 
the control of constitutive promoters. After systematic deletion of 
the genome, the total number of essential genes in the E. coli 
genome has been minimized to generate the minimal genome 
consisting of 302 essential genes ([53]; also see PEC database). 
Among 302 essential genes, 70 (23%) were identified to be under 
the control of constitutive promoters. Noteworthy is that the 
expression of some essential genes should be under the control of 
constitutively expressed positive transcription factors. In fact, we 
have identified the presence of approximately 100 species of E. coli 
transcription factor throughout cell growth. 

Marked variation of the promoter sequences listed in databases 
must have been arisen from several different factors: 1) the list of 
RpoD-dependent promoters include those recognized by RpoS 
(and possibly other minor sigma factors), of which the promoter 
recognition properties overlaps with RpoD [54,55]; 2) the list also 
includes promoters that are functional in vivo only under the 
support of positive TFs [56-58]; and 3) a variety of promoter-like 
sequences have been identified in silico to be promoters. In fact 
approximately 60% of the RpoD holoenzyme-binding sites are 
located inside open reading frames (see Fig. 1A). Binding of RNA 
polymerase on some open reading frames has been recognized 
[59]. The unexpected high number of RpoD holoenzyme-binding 
sites inside open reading frames raises a possibility of an as yet 
unidentified functional role(s) for the RpoD holoenzyme. These 
promoter-like sequences may contribute transcription initiation 
from internal promoters and/ or blocking the migration of 
elongation complexes. Clustering of promoter-like sequences 
within the E. coli genome were predicted by Collado-Vides and 
colleagues [60] while Ozoline and colleagues proposed the 
presence of as many as 78 'Promoter Islands' [59,61]. These 
promoter-like sequences could form transcriptionally inactive 
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Table 3. Constitutive Promoters (Type-B Spacers) (Rightward transcription). 



Map 


Left Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Right Operon 


Promoter sequence 


0.37 


sokC 


sokC 


> 


HNS 


> 


nhaA 


nhaAR 


T T AACC - 1 7 -TCTAAT 


0.61 


dapB 


dapB 


> 




> 


carA 


carAB 


TTGACT-1 7-CAGAAT 


0.99 


uaaLI 


yaaU 


> 




> 


kefF 


kefFC 


TTGACT-1 6-TATGAC 


2.27 


(mraZ- -IpxC) 


ftsZ 


> 




> 


IpxC 


IpxC 




2.76 


tpdhR- -Ipd) 


aceF 


> 




> 


Ipd 


Ipd 


T T TAAA- 1 7 -TAAAAT 


3.55 


hrpB 


hrpB 


> 




> 


mrcB 


mrcB 


TTGAGA- 1 7 -TGTAAC 


3.61 


mrcB 


mrcB 


> 




> 


fhuA 


fhuACDB 


TTGCGA-1 8-TATTAT 


3.81 


dcA 


dcA 


> 




> 


erpA 


erpA 


TAGATA- 1 9 -TAG AAT 


4.33 


(bamA- -IpxA) 


IpxD 


> 




> 


fabZ 


(bamA- -IpxA) 


TCGCCA-15-TCTCAT 


4.36 


(bamA- -IpxA) 


fabZ 


> 




> 


IpxA 


(bamA- -IpxA) 


Internal Promoter 


5.41 


dinB-yafNOP 


dinB 


> 




> 


yafN 


yafNOP 




5.62 


(proBCj 


proA 


> 




> 


thrW 


thrW 


TTGACG-1 5-TTTAAC 


7.15 


betT 


betT 


> 


HNS 


> 


yahA 


yahA 


TTGATC-1 6 - TAT AAT 


8.44 


yaiU 


yaiU 


> 


HNS 


> 


yaiV 


yaiV 


TTCACT-18 -TTTA AT 


8.73 


yail 


yail 


> 


HNS 


> 


aroL 


aroL -yaiA -aroM 


TCGAAA- 1 7 -TATGAT 


9.33 


(nrdR- -pgpA) 


ribD 


> 




> 


ribE 


(nrdR- -pgpA) 


Internal Promoter 


9.79 


bloA 


bolA 


> 




> 


tig 


tig 


TCGACT-1 1-TA TAA T 


9.83 


tig 


tig 


> 




> 


dpP 


dpPX 


T TGAAA- 1 7 - CAT AAC 


10.08 


cof 


cof 


> 




> 


ybaO* 


ybaO 


TTGTCG-17 -TAAAAT 


10.70 


adk 


adk 


> 




> 


hemH 


hemH 


TTATCA-15-GATATT 


11.26 


tybbAP) 


ybbP 


> 


HNS 


> 


rhsD 


rhsDC-yibH 


T T AAT A- 1 7 -TGTAAT 


11.36 


(rhsD- -ylbH) 


ylbH 


> 




> 


ybbD 


ybbD 




11.38 


ybbD 


ybbD 


> 




> 


ylbl 


ylbl 


TCGTCA- 1 9 -TAAAAT 


12.23 


(renD-emrE) 


emrE 


> 


HNS 


> 


ybcK 


ybcK 


G T GAC C - 1 7 - T AAAAA 


12.24 


ybcK 


ybcK 


> 


HNS 


> 


ybcL 


ybcLM 


GTGGCA- 1 7 - TACAAT 


12.29 


ybcLM 


ybcL 


> 




> 


ybcM 


(ybcLM) 


Internal Promoter 


12.55 


tfaX 


tfaX 


> 


HNS 


> 


appY* 


appY 


TTATCA- 1 7 -TTTAAT 


13.06 


sokE 


sokE 


> 




> 


hokE 


hokE 




13.56 


(entcEBAH) 


entH 


> 




> 


cstA 


cstA 


T T TAC A- 1 5 -TAAATT 


14.15 


pagP 


pagP 


> 




> 


cspE 


cspE 


TGGACA- 1 7-TGTACT 


14.57 


ybeR-djiB 


ybeR 


> 




> 


djIB 


(ybeR-djiB) 


Internal Promoter 


15.84 


(ybfOC) 


ybfC 






> 


ybfQ 


ybfQ 


TTTTCA- 1 7 -AATACT 


16.61 


ImngAB) 


mngB 


> 


HNS 


> 


cydA 


cydAB 


TCTACA- 1 7 -TATATT 


17.49 


(biobFCD) 


bioD 


> 




> 


uvrB 


uvrB 


TTGGCA- 1 7 -TAAAAT 


17.64 


tmoaABCDE) 


moaE 


> 




> 


ybhL 


ybhL 


TGCACA- 17- TATCCT 


17.65 


ybhL 


ybhL 


> 


HNS 


> 


ybhM 


ybhM 


ACGACA- 1 6 -TATAAA 


19.21 


(ybjC- -ybjN) 


nfsA 


> 




> 


rimK 


rimK-ybjN 




20.23 


(lolA-rarA) 


rarA 


> 




> 


serS 


serS 


TGGCCA- 1 7 -GATAAG 


20.68 


(serC-aroA) 


aroA 


> 




> 


ycaL 


ycaL 


TTGATA- 1 7 -ATTAAT 


23 37 


{ eT&L/D/ 


efeB 




HNS 




phoH 


phoH 


rprprp7\rp7\ -i f~ T™ A T rt 1 1 

1 1 1A1A-1 b— 1 A 1 A 1 1 


24.80 


trpmF- -fabHDG) 


fabG 


> 




> 


acpP 


acpP-fabF 


TTGCAA- 1 6 -TACACT 


24.84 


(acpP-fabF) 


fabF 


> 




> 


pabC 


pabCG- -ycfH 


CTGCCA- 15 -GATAAG 


25.12 


(hinT- -ycfPj 


ycfP 


> 


HNS 


> 


ndh 


ndh 


CTCACA- 1 7 -AACAAT 


25.92 


(ymfTL- -ymfS) 


ymfL 


> 




> 


ymfM 


(ymfTL- -ymfS) 


Internal Promoter 


26.07 


pinE 


pinE 


> 




> 


mcrA 


mcrA 


TTGTCG-1 7 -ATTAAT 


26.19 


ycgZ- -ymgC 


ycgZ 


> 




> 


ymgA 


(ycgZ- -ymgC) 


Internal Promoter 
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Table 3. Cont. 



Map 


Left Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Right Operon 


Promoter sequence 


26.20 


(ycgZ- -ymgC) 


ariR 


> 




> 


ymgC 


(ycgZ- -ymgC) 


Internal Promoter 


26.22 


(ycgZ- -ymgC) 


ymgC 


> 


HNS 


> 


ycgG 


ycgG 


TTGACG-1 9-TATTTT 


26.27 


ymgF 


ymgF 


> 


HNS 


> 


ycgH 


ycgHI 


TTGACA- 1 9 -TATAAG 


28.00 


ychE 


ychE 


> 


HNS 


> 


oppA 


oppABCDF 


TTAACA- 1 7 - AAGAAT 


28.51 


(yciVOQ) 


yciO 


> 




> 


yciQ 


(yciVOQ) 


Internal Promoter 


28.71 


topA 


topA 


> 


HNS 


> 


cysB* 


cysB 


TTCACA- 1 5 -TATAAA 


28.84 


pgpB 


pgpB 


> 




> 


yciS 


yciSM 


TTGATT - 1 8 - AATCAT 


31.32 


(paaAB- -GHIJK) 


paaB 


> 




> 


paaC 


(paaAB- -GHIJK) 


Internal Promoter 


31.44 


tpaaAB- -GHIJK) 


paaJ 


> 




> 


paaK 


tpaaAB- -GHIJK) 


Internal Promoter 


31.63 


insl-2 


insl-2 


> 




> 


ydbC 


ydbC 


TTAACA-1 7-TCGAAT 


32.89 


yncH 


yncH 


> 


HNS 


> 


rhsE 


rhsE 


TTGACT -17- TATTAC 


34.53 


(IsrACD- -tarn) 


IsrD 


> 




> 


IsrB 


(IsrACD- -tarn) 


Internal Promoter 


35.16 


ydfK 


ydfK 




HNS 


> 


pinQ 


pinQ 


T GTACA- 1 6 - AATAAT 


35.51 


dicF 


dicF 


> 




> 


dicB 


dicB- -insD-intQ 




35.98 


ynfM 


ynfM 


> 


HNS 


> 


asr 


asr 


GTCACA- 1 8 -TGTAAT 


36.26 


(rstAB) 


rstB 


> 




> 


tus 


tus 


TGGTCA- 1 7 -TATAAA 


37.36 


ydhO 


ydhO 


> 




> 


sodB 


sodB 


TTGCTA-16-AATAAG 


37.49 


ydhC 


ydhC 


> 




> 


cfa 


cfa 


C T AAC A- 1 7 - TGAAAT 


38.12 


rprA 


rprA 


> 




> 


ydiL 


ydiL 


CTGATA-15-TATTGT 


38.53 


aroH 


aroH 


> 


HNS 


> 


ydiE 


ydiE 


T TGAT A- 1 6 - TAT C AA 


38.97 


yniC 


yniC 


> 




> 


ydjM 


ydjM 


CTGAAA- 1 7 -ATTAAT 


38.99 


ydjM 


ydjM 


> 




> 


ydjN 


ydjN 


ATGACT-1 6 -AATAAT 


39.26 


nadE 


nadE 


> 




> 


cho 


cho 


TTGTCA-1 5-TTTAAT 


41.46 


holE 


holE 


> 




> 


yobB 


yobB-exoX 


TATACA- 16- CAT AAC 


43.96 


(yedVZ) 


yedZ 


> 




> 


zinT 


zinT 


TTGTCA-1 8 -AATAAT 


44.29 


amn 


amn 


> 


HNS 


> 


yeeN 


yeeN 


TAGACG- 1 8-771 TAA T 


44.60 


yeeP 


yeeP 


> 




> 


isrC 


isrC 


TTGTCC-17-TAGAAT 


44.60 


isrC 


isrC 


> 




> 


flu 


flu 




46.62 


(mdtA- -baeSR) 


baeR 


> 




> 


yegP 


yegP 


CTGGCA-17-CATACT 


48.45 


yeiH 


yeiH 


> 




> 


nfo 


nfo 




48.47 


nfo 


nfo 


> 




> 


yeil 


yeil 




49.23 


(yejLM) 


yejM 


> 




> 


proL 


proL 


TTGCAA- 1 6 -TAGTAT 


49.61 


yojO 


yojO 


> 


HNS 


> 


eco 


eco 


GCGACA- 1 5 - TATAAA 


51.29 


rbn 


rbn 


> 




> 


elaD 


elaD 


TTAAAA-18-TGTTAT 


54.54 


cysZ 


cysZ 


> 




> 


cysK 


cysK 


ATGTCA- 1 6 -TATAGA 


55.21 


(yffOPj 


yffP 


> 




> 


yffQ 


yffQR 


CTCACA-16-TATCAC 


55.23 


(yffQR) 


yffR 


> 




> 


yffS 


yffS 




56.02 


bcp 


bcp 


> 




> 


hyfA 


hyfAB- -GHIJR 


AT GAC C - 1 7 - C AG AAT 


56.02 


hyfAB- -GHIJR 


hyfA 


> 




> 


hyfB 


(hyfAB- -GHIJR) 


Internal Promoter 


57.13 


sseA 


sseA 


> 


HNS 


> 


ryfA 


ryfA 


TTGTCA-1 6-TATTGT 


58.93 


bamD 


bamD 


> 




> 


raiA 


raiA 


CTGTCA- 1 8 -TTTAGT 


59.27 


nadK 


nadK 


> 




> 


recN 


recN 


TTTACG-17 -TATAAA 


59.35 


smpB 


smpB 


> 




> 


ssrA 


ssrA 


TGGTCA- 1 8 -TATACT 


59.59 


(rnlAB) 


rn/B 


> 






yfjP 


yfjPQ 


T TGAAA- 1 5 -TATCAT 


59.79 


yfjW 


yfjW 


> 




> 


yfjX 


yfjXYJZ-ypjF 


TTGGCA-19 -TATAAA 


59.82 


(yfjXYJZF) 


ypjF 


> 




> 


psaA 


psaA 


CTGACA-1 5-AACAAT 
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Map 


Left Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Right Operon 


Promoter sequence 


60.31 


ygaM 


ygaM 


> 




> 


nrdH 


nrdHIEF 


T CAACA- 1 8 -TATCAT 


60.46 


(proVWX) 


proW 


> 




> 


proX 


IproVWX) 


T T ATC A- 1 6 - AAT AAC 


60.51 


(ygaXY) 


ygaY 


> 




> 


ygaZ 


ygaZH 


TTAAGA-17- TATAAT 


60.92 


(srIAEBD) 


srID 


> 




> 


gutM* 


gutM-srIRQ 




61.59 


mutS 


mutS 


> 


HNS 


> 


pphB 


pphB 


TTAACG- 1 7 - TAAAAA 


64.43 


ygeF 


ygeF 


> 


HNS 


> 


ygeG 


ygeG 


TTTAAA- 1 7 - TAT C AA 


64.49 


ygel 


ygel 


> 




> 


pbl 


pbi 


TTGACC-16-GATACT 


65.32 


yqfC 


yqfG 


> 




> 


idi 


idi 


TTGTCG-18-AATCAT 


65.82 


zapA 


zapA 


> 




> 


ssrS 


ssrS-fau 




65.83 


(ssrS-fau) 


fau 


> 




> 


sibC 


sibC 


TTGACAA 5-CCTAAT 


66.49 


metK 


metK 


> 


HNS 


> 


galP 


galP 


AT AAC A- 1 8 -TATAAC 


67.93 


metC 


metC 


> 




> 


yghB 


yghB 


TGGACA- 1 5 -TATTGT 


68.61 


yqiC 


yqiC 


> 


HNS 


> 


ygiL 


ygiL 


TCGATA- 1 7 -TATAAA 


69.10 


(ttdABT) 


ttdB 


> 




> 


ttdT 


(ttdABT) 


Internal Promoter 


69.42 


ebgR 


ebgR 


> 




> 


ebgA 


ebgAC 


TTGCCG-15-TATTTT 


69.55 


ygjJK 


ygjJ 


> 




> 


ygjK 


(ygjJK) 


Internal Promoter 


69.79 


aix 


alx 


> 




> 


sstT 


sstT 


CTGACC-17-TGTCAT 


69.93 


exuT 


exuT 


> 




> 


exuR* 


exuR 


TTTTCA-1 6-TAAACT 


69.96 


exuR 


exuR 


> 




> 


yqjA 


yq/A-mzrA 


TTGTCT-1 7 -TATAAA 


70.39 


tdcR 


tdcR 


> 




> 


yhaB 


yhaBC 


TTGATA-1 9-GAAAAT 


70.79 


(agaS-agaBCDI) 


agal 


> 


HNS 


> 


yraH 


yraHl 


TTGATA-1 7-TCAAAT 


71.84 


ispB 


IspB 


> 




> 


sfsB* 


sfsB 


TTTAGA-1 8-TATAGT 


72.93 


argR 


argR 


> 


HNS 


> 


yhcN 


yhcN 


TTGAAA- 1 8 -AATAAC 


73.46 


(panf-prmA) 


prmA 


> 




> 


dusB 


dusB-fis 


GTGCCA-18-AAAAAT 


75.03 


yheST 


yheS 


> 




> 


yheT 


(yheST) 


Internal Promoter 


75.35 


(nirBDC-cysG) 


cysG 


> 


HNS 


> 


yhfL 


yhfL 


TTAACG-1 9 - TATAAT 


75.39 


yhfL 


yhfL 


> 




> 


frIA 


frIABCDR 


CTGACA-1 8-TTTAAT 


76.26 


yhgF 


yhgF 


> 


HNS 


> 


feoA 


feoABC 


TTATCA-15 -TTTAAT 


77.16 


yhhZA 


yhhZ 


> 




> 


insA-6 


insAB-6 


TTGAAA- 1 7-TTTAAT 


77.20 


(insAB-6) 


insB6 




HNS 


> 


yrhD 


yrhD 


TAGAGA- 1 8 -TATATT 


78.71 


sIp-dctR 


sip 




HNS 


> 


dctR 


dctR 


T T AAT A- 1 7 -TATTAT 


80.65 


yiaK 


yiaK 


> 




> 


yiaL 


(yiaK- -sgbHUE) 


Internal Promoter 


81.13 


yibA 


yibA 


> 




> 


yibJ 


yibJ 




81.15 


yibJ 


yibJ 


> 




> 


yibG 


yibG 




81.20 


yibV 


yibV 






> 


yibU 


yibU 


T T AAC T - 1 5 - G ATA AT 


81.79 


(rfaD-waaFCL) 


waaC 


> 




> 


waaL 


(rfaD-waaFCL) 


Internal Promoter 


82.24 


yicC 


yicC 


> 




> 


dinD 


dinD 


GTGAGA- 1 5 -TATAAA 


82.34 


dinD 


dinD 


> 




> 


yicG 


yicG 


TTATCA-16-AAAAAT 


82.48 


xanP 


xanP 


> 




> 


yicH 


yicH 




85.10 


ilvLXGMEDA 


HvL 


> 




> 


ilvX 


(ilvLXGMEDA) 


Internal Promoter 


85.43 


trxA 


trxA 


> 




> 


rhoL 


rhoL-rho 


TTGACT-1 7-TATTAA 


85.74 


(rfe- -tffT- -rffM) 


rffM 


> 




> 


yifK 


yifK 


AT TAC A- 1 5 -TTTAAT 


87.25 


polA 


polA 


> 




> 


spf 


spf 


CTGTCA-17-TAGAAA 


87.47 


typA 


typA 


> 




> 


yihL 


yihLM 




88.49 


fieF 


fieF 


> 




> 


pfkA 


pfkA 




90.08 


(rplKAJL-rpoBC) 


rpIL 


> 




> 


rpoB 


(rplKAJL-rpoBCI 


TAGTCA- 1 5 - TGT AAG 


91.98 


aphA 


aphA 


> 




> 


yibQ 


yibQR 
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Map 


Left Operon 


Left Gene 


D 


RpoD 


D 


Right Gene 


Right Operon 


Promoter sequence 


92.52 


(nrfABCDEFG) 


nrfG 


> 


HNS 


> 


gltP 


gltP 


ATGCCA-1 8-TATTAT 


94.29 


ecnB 


ecnB 


> 




> 


sugE 


sugE 


TTGAAA- 1 6 -CAAAAT 


94.89 


yjeT 


yjeT 


> 




> 


purA 


purA 


CTGAAA-1 9-TTTAAG 


95.01 


(nsrR-rnr- -yjflJ) 


rlmB 


> 




> 


yjfl 


(nsrR-rnr- -yjflJ) 


TTAATA- 1 7 - TGGAAT 


95.35 


(rpsF-priB- -rpll) 


rpsR 


> 




> 


rpll 


(rpsF-priB- -rpll) 


Internal Promoter 


97.06 


(yjhBC) 


yjhC 


> 




> 


ythA 


ythA 




97.85 


fimB 


fimB 


> 


HNS 


> 


fimE 


fimE 


TTGTAA- 1 7 -C ATA AT 


98.12 


(uxuAB) 


uxuB 


> 




> 


uxuR* 


uxuR 


GTGCCA- 1 7 -TATAGT 


98.51 


yjiS 


yjiS 


> 


HNS 


> 


yjiT 


yjiT 


TTGAGA- 1 8 -TATAAA 


99.35 


osmY 


osmY 


> 




> 


ytjA 


ytjA 




99.36 


ytjA 


ytjA 


> 




> 


yjju 


yjju 




99.52 


(deoCABD) 


deoA 


> 




> 


deoB 


(deoCABD) 


Internal Promoter 


99.57 


(deoCABD) 


deoD 


> 


HNS 


> 


yjjJ* 


yjp 


TTTTCA-1 8-TCTATT 


99.57 


yjjJ 


yjp 


> 


HNS 


> 


yjtD 


yjtD 


TTGTCG- 1 7 -AATTAT 










HNS 






154 (b) 


271 (a) 










38 






9 TFs 


1.76 (a/b) 










25% 






17 essential 


62 Y-genes 



Among the total of 1,075 RpoD holoenzyme-binding sites, 154 are located within type-B spacers upstream of right-side genes and downstream of lest-side genes, 
indicating that these promoters direct rightward transcription. The genes and operons under the control of these 1 54 promoters were estimated, of which 1 8 represent 
putative internal promoters. A total of 9 genes encoding transcription factors are indicated by star symbols (*). The essential genes listed in the PEC database are 
underlined within the operons. The promoter sequence with complete match with the canonical promoter (see Fig. 4) is shown in bold and italic while the promoter 
sequence with 5-out of-6 match is shown in bold. The spacers including H-NS binding sites are marked as HNS in the spacer column. Total number of genes under the 
control of 154 promoters were 271 (1.76 gene per promoter). Among the total of 154 RpoD holoenyme-binding sites, 38 (25%) overlap with the H-NS-binding sites. 
doi:1 0.1 371 /journal.pone.0090447.t003 



complexes with RNA polymerase but might contribute to increase 
local concentrations of RNA polymerase on the genome. 

Some of the constitutive promoters with high-level of conser- 
vation of the consensus sequence are located within long spacers 
with long UTR sequences but lacking protein-coding sequences of 
reasonable sizes. In these cases, it would be worthwhile to test as 
yet unidentified regulatory sRNAs [62]. Noteworthy is that such 
long spacers including the constitutive promoters often correspond 
to the Promoter Islands. 

The consensus sequence of constitutive promoters 

The canonical model of RpoD promoters consisting of two 
hexanucleotide sequences, TTGACA -35 signal and TATAAT - 
10 signal, each being separated by 17 bp linker was originally 
identified using in vitro transcription of some model templates by 
purified RNA polymerase [10,11]. Here we identified the 
consensus sequence of as many as 669 constitutive promoters. 
The most significant feature of constitutive promoters is the high- 
level conservation of canonical TTGACA(-35)-17bp-TATAAT(- 
10) sequence. We also identified the roles and -35 and -10 signals 
and the conservation level of each base within these two 
hexanucleotide sequences. The promoter -35 TTGACA signal 
plays a key role in binding the RNA polymerase (see Fig. 4) but G 
at position 3 and C at position 5 are relatively less important for 
the constitutive promoters, suggesting that both play roles in TF- 
depending inducible promoters. The promoter -10 TATAAT 
signal plays a major role in promoter opening (see Fig. 4) in 
agreement with the previous proposals [63,64], but the novel 
finding is that all six bases of TATAAT are equally important for 
this -10 signal function, supporting the hypothesis that the 



cooperativity of energy threshold, but not interaction of individual 
bases with RpoD, are important factors guiding the dynamics and 
selectivity of promoter open complex formation [65] . 

Regulation of the conserved promoter 

The constitutive promoters must be repressed in cases when the 
genes under their control are not necessary. A group of silencing 
proteins play roles in preventing the potentially harmful effects of 
uncontrolled expression of the constitutive promoters. In E. coli, 
the H-NS family proteins are the major players of anti-silencing 
[44,45,66]. The high-level of overlapping was observed in the 
distribution between the constitutive promoters (see Tables 1, 2 
and 3) and the binding sites of silencer H-NS [45] . A total of 203 
(30%) of the constitutive promoters were predicted to be under the 
control of H-NS silencer, but silencing proteins for other 70% 
promoters remain unidentified. The spectrum of silencing targets 
by H-NS should be modulated by interaction with the members of 
Hha/YdgT family of small-sized co-regulators [67,68]. Possible 
involvement of other growth condition-specific nucleoid proteins 
in silencing such as Dps (DNA-binding protein in starved cells) 
[69] and Dan (DNA-binding protein under anaerobic conditions) 
[32] awaits further studies. 

It should be noted that the constitutive promoters are also 
subject to activation by positive TFs for enhanced expression of 
the target genes. For instance, the csgD gene encoding the master 
regulator of biofilm formation carries one of the Type-A 
constitutive promoters, which is controlled by as many as 20 
positive and negative TFs [22,33]. The level of constitutive csgD 
promoter is higher than that of csgB promoters directing 
transcription toward opposite orientation (see Fig. 2A[b]), but 
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Type-A 




Figure 2. RpoD holoenzyme-binding peaks within type-A spacers. RpoD holoenzyme-binding peaks were identified within a total of 177 
type-A spacers. Some representative patterns of RpoD holoenzyme-peaks are shown, which are located between leuL-leuO (a), csgD-csgB (b), nanC- 
fimB (c), IrhA-alaA (d), adhE-ychE (e), waaQ-waaA (f), mdoC-mdoG (g) and ulaG-ulaA (h). Distribution of promoter -35 (indicated by orange arrows) and 
-10 (indicated by green arrows) signals is shown in the panel under each SELEX pattern. 
doi:1 0.1 371 /journal. pone.0090447.g002 



once the regulator CsgD is produced, it antagonizes H-NS silencer 
and the expression of the csgBAC operon is markedly enhanced, 
leading to production of curli fimbriae for biofilm formation. 

In conclusion, we classified E. coli promoters into the 
constitutive promoters recognized by RNA polymerase holoen- 
zyme alone and the transcription factor-assisted inducible 
promoters. Using the information of RpoD holoenzyme-binding 
sites identified by Genomic SELEX screening system, we 
predicted a total of 669 constitutive promoters with high-level 
conservation of the promoter consensus sequences. This finding 
indicates that the majority of hitherto identified promoters 
represent the TF-dependent inducible promoters. 

Materials and Methods 

Bacterial strains and plasmids 

E. coli K12 W3350 type-A [70] was used for purification of 
RNA polymerase and the template DNA for Genomic SELEX 
screening of RpoD promoters. E. coli BL21(DE3) was used for the 
expression and purification of sigma and core enzyme subunit 
proteins. Expression plasmids for the core enzyme subunits 
(pRpoD, pRpoA, pRpoB and pRpoC) and all seven sigma 
subunits (pRpoD, pRpoS, pRpoN, pRpoH, pRpoF, pRpoE and 
pFec) were constructed by ligating the respective coding sequenc- 
es, which were prepared by PCR amplification of the E. coli K12 
W3350 type-A genome DNA as template, into pET2 1 expression 



vector essentially according to the standard procedure used for 
expression of all seven sigma subunits and all 300 transcription 
factors in this laboratory [71,72]. 

Purification of core RNA polymerase 

RNA polymerase was purified from log-phase cells of E. coli K- 
12 W3350 by the standard procedure [26]. Separation of the core 
and holoenzymes was performed by passing the purified RNA 
polymerase through P 1 1 phosphocellulose column in the presence 
of 50% glycerol [26]. To remove trace amounts of the core 
enzyme-associated sigma factors, the purified RNA polymerase in 
the storage buffer containing 50% glycerol was dialyzed against 
the same buffer but containing 5% glycerol and fractionated by 
phosphocellulose column chromatography in the presence of 5 % 
glycerol [26]. The level of remaining sigma factors was less than 
0.1%, if any, as checked by immuno-staining with antibodies 
against each of seven sigma factors. 

Purification of core and sigma subunits 

The core enzyme subunits (RpoA, RpoB, RpoC and RpoZ) 
were expressed using the respective expression plasmids and 
purified by two cycles of column chromatography through DEAE 
(DE52) and Pll phosphocellulose [26]. Sigma subunits were 
expressed and purified by ion-exchange column chromatography 
through DE52 and PI 1 followed by Sephacryl S-300 gel filtration 
column [26]. The purified sigma and core subunit proteins were 
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Type-B 




Figure 3. RpoD holoenzyme-binding peaks within type-B spacers. RpoD holoenzyme-binding peaks were identified within a total of 315 
type-B spacers. Some representative patterns of RpoD holoenzyme-peaks are shown, which are located, which include the constitutive promoters for 
cydA (a), yfcV (b), yobF (c), ompT (d), yjeJ (e), yhcN (f), yfjL (g) and phoH (h) operons. Distribution of promoter -35 and -10 is shown below each panel. 
doi:1 0.1 371 /journal.pone.0090447.g003 



more than 99% pure as judged by protein staining of SDS-PAGE 
gels. 

Preparation of antibodies 

Antibodies against RpoD sigma and core enzyme subunits were 
produced in rabbits by injecting purified sigma proteins. The 
protocol for antibody production was raised following the Ethical 
Guidelines proposed by the Science Council of Japan and the 
Japanese Government, and approved by the Committee on the 
Ethics of Animal Experiments in the Animal Laboratory of 
Mitsubishi Chemical Medience Co. (Uto, Kumamoto, Japan). 
Antibodies against each RNA polymerase proteins were produced 
in two rabbits, and after examination of antibody activity using 
immune-blot analysis, the batch of higher activity was used in this 
study. Anti-RpoD, anti-RpoC used in this study did not cross-react 
with each other. 

Genomic SELEX screening of RpoD holoenzyme-binding 
sequences 

The genomic SELEX method was carried out as previously 
described [21]. A mixture of DNA fragments of the E. coli K-12 
W3 1 1 0 genome was prepared after sonication of purified genome 
DNA, and cloned into a multi-copy plasmid pBR322. In each 
SELEX screening, the DNA mixture was regenerated by PCR. 
For SELEX screening, 5 pmol of the mixture of DNA fragments 
and 10 pmol RNA polymerase RpoD holoenzyme were mixed in 
a binding buffer (10 mil Tris-HCl, pH 7.8 at 4°C, 3 mM 
magnesium acetate, 150 mM NaCl, and 1.25 mg/ml bovine 



serum albumin) and incubated for 30 min at 37°C. The DNA- 
RNA polymerase mixture was treated with anti-RpoC antibody 
and DNA fragments recovered from the complexes were PCR- 
amplified and subjected to next cycle of SELEX for enrichment of 
RNA polymerase-bound DNA fragments. 

For SELEX-chip analysis, DNA samples were isolated from the 
DNA-protein complexes at the final state of SELEX, PCR- 
amplified and labeled with Cy5 while the original DNA library 
was labeled with Cy3. The fluorescent labeled DNA mixtures were 
hybridized to a DNA microarray consisting of 43,450 species of 60 
b-long DNA probe, which are designed to cover the entire E. coli 
genome at 105 bp interval (Oxford Gene Technology, Oxford, 
UK) [27-29]. The fluorescent intensity of test sample at each 
probe was normalized with that of the corresponding peak of 
original library. After normalization of each pattern, the Cy5/Cy3 
ratio was measured and plotted along the E. coli genome. 

Mixed transcription assay in vitro 

The promoter sequence of 195 bp-long lacUV5 template and its 
3' truncated 175 bp-long DNA was modified to the consensus 
sequence, TTGACA(-35)-17 bp-TATAAT(-lO), each producing 
run-off transcripts of 42 and 22 nucleotides in length, respectively. 
Starting from these DNA fragments with the ideal promoter, the 
complete set of variant consensus promoters, each containing one 
base substitution at all positions of the hexanucleotide sequences of 
promoter -35 and -10, was prepared as described previously [23]. 
The test promoters of variant consensus promoters directed the 
synthesis of 42 b-long RNA while the reference template 
containing the original consensus promoter directed the synthesis 
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WEfffffffl Parameter I Parameter II 




123456 123456 



Figure 4. Determination of the consensus sequence of constitutive promoters using the in vitro mixed transcription system. Mixtures 
of equal amounts of 1 95 bp-long template containing the ideal promoter of complete consensus sequence and 1 75 bp-long mutant template, each 
carrying one base substitution, were subjected to the in vitro mixed transcription [24,25]. After preincubation for 0.5, 1 .0, 2.5, 5.0, 7.5, 1 0 and 15 min, a 
mixture of substrates and heparin was added and RNA synthesis was allowed for 10 min. The final level of RNA synthesis represents the level of RpoD 
holoenzyme binding (parameter I) while the rate of open complex formation (parameter II) was determined as a reciprocal of the time required to 
reach the plateau level. For each set of four promoters with mutations at the same position, the promoter activities are shown as the values relative 
to the promoter with the highest activity. 
doi:10.1371/journal.pone.0090447.g004 



of 22 b-long RNA. The in vitro mixed transcription was performed 
under the standard single-round reaction conditions in the 
presence of [ 32 P]UTP as a labeled substrate [23—25]. In brief, a 
mixture of these two templates was preincubated with RpoD 
holoenzyme for various times (0.5, 1.0, 2.5, 5.0, 10, 15 and 
30 min) at 37°C in the standard transcription assay buffer, and 
then a mixture of substrates and heparin was added to allow a 
single-round transcription for 15 min. Heparin inactivates RNA 
polymerase that was not involved in open complex formation. 
RNA products were separated by electrophoresis on 10% PAGE 
containing 8.3 M urea. The amount of RNA was determined by 
measuring the intensity of P radioactivity and corrected for the 
U content of each transcript. 

Promoter sequence analysis 

The complete genome sequence of E. coli K-12 MG1655 
(U00096.2) was used for the promoter sequence analysis. The perl 
scripts for finding the candidate sequences of promoters include 
the following functions: 1) extraction of every 6 bases in the 
genome sequences with sliding 1 bp; 2) comparison of the 6 base 
sequences with TTGACA (-35) and calculation of the score by 
setting one match scoring 1 point, one mis-match scoring 0 point, 
and without gap; 3) comparison of the 6 bases with TATAAT (-10) 
under the same conditions; and 4) extraction of a pair of -35 and - 
1 0 hexanucleotide sequences by setting the best-match score of 1 2 
(6 at -35 and at -10 signal); 6) extraction of a pair of -35 and -10 
signals with a spacer length of 1 7 plus/ minus 2, giving the score of 



+3 for 17 bp, +2 for 16 and 18 bp; and +1 for 15 and 19 bp, 
respectively. Thus, the maximum score for the best-match 
promoter in both -35 and -10 sequences separated by 17 bp 
spacer is 15. 

The Logo pattern analysis of promoter sequences was 
performed using the established sequence logo generator [73]. 

Supporting Information 

Table SI RNA polymerase RpoD holoenzyme-binding sites on 
the E. coli genome. A total of 1,075 RpoD holoenzyme-binding 
sites were identified within spacers on the entire E. coli K-12 
W3110 genome. The binding sites identified within intergenic 
spacers were classified into Type-A, Type-B and Type-C (see Fig. 2 
and Table 1). The constitutive promoters were predicted based on 
the location of the RpoD holoenzyme-binding sites. [SI A] 
Promoters within type-A spacers. The genes under the constitutive 
promoters on left- and right-sides of type-A intergenic spacers and 
their functions are described. The genes shown under filled 
background encodes DNA-binding transcription factors (TFs), 
each regulating a set of target genes. The spacers under grey 
background contain two distinct peaks of holoenzyme binding. 
The positions of these genes on the genome are described on the 
left-end column while the levels of SELEX-peaks relative to the 
highest peak are described on the right-end column. [SIB] 
Promoters within type-B spacers. The genes under the constitutive 
promoters on either left- or right-sides of type-B intergenic spacers 
and their functions are described. The positions of these genes on 
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Figure 5. The composition of constitutive promoters. [A] The known consensus sequences of RpoD-dependent promoter, TTGAAC (-35) and 
TATAAT (-10) separated by 17 plus/minus 2 bp, were searched for all type-A (177) and type-B (315) spacers (see Experimental Procedure for the 
analysis method). Most of the constitutive promoters carry high-levels of the consensus sequence as listed in Table 1 . The composition of promoter - 
35 and -1 0 sequence was classified into 8 groups based on the conservation level of consensus sequences. About 89% of type-A promoters and 82% 
of type-B promoters (or 86% of A- plus B-type promoters) contain the sequence higher than 4/6 agreement with the consensus sequence at both -35 
or -10 positions (A1), while only 39% of a total of 582 known promoters carries this level of consensus sequences (B1). [B] The whole set of 
constitutive promoter sequences were subjected to Logo analysis [73], The Logo patterns of -35 and -10 sequences are essentially the same among 
the constitutive promoters within type-A and type-B spacers. The Logo pattern of the whole set of constitutive promoters was compared with the 
Logo pattern generated using the total of 582 experimentally identified promoters [1 9,20]. The contribution of each base of the consensus -35 and - 
10 sequences is significantly different between the constitutive promoters and the set of known promoters. 
doi:1 0.1 371 /journal.pone.0090447.g005 



the genome are described on the left-end column while the levels 
of SELEX-peaks relative to the highest peak are described on the 
right-end column. The genes shown under black background 
encodes DNA-binding TFs. The spacers under grey background 
contain two distinct peaks of holoenzyme binding. 
(PDF) 

Table S2 Operons under the control of the constitutive 
promoters. Operons under the control of constitutive promoters 
are listed on the operon columns. Promoters shown by shaded 
background represent those listed in RegulonDB and EcoCyc 
databases. Spacers containing H-NS binding sites are indicated by 
HNS. [S2A] Operons under the control of constitutive promoters 
within type-A spacers. The constitutive promoters identified within 
type-A spacers direct bidirectional transcription of the operons 



indicated on left- or right-operon columns. The map positions of 
the promoter-proximal genes are indicated on the left- and right- 
end columns. [S2B] Operons under the control of constitutive 
promoters within type-B spacers. The constitutive promoters 
identified within type-B spacers direct transcription toward one 
direction. 
(PDF) 
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